Taiji-pipeline / Taiji

All-in-one analysis pipeline
https://taiji-pipeline.github.io/
BSD 3-Clause "New" or "Revised" License

RNAalign/STAR issue #11

Closed: maeleck closed this issue 3 years ago

maeleck commented 3 years ago

Hi,

While I am excited about the software and thankful for your work, I am having some issues. I was able to download ATAC-seq and RNA-seq data from SRA through Taiji's input YAML configuration, and I was able to set up dependencies such as MACS2 and Python on my workplace server. The pipeline ran all the way to the output stage, where I get qc.html, but there were some hiccups. Taiji stopped at the RNA_Align step (error below), so I had to type in the STAR command line manually. I tried different versions of STAR (2.7, 2.6, 2.5.3, and 2.4) but I still get the same error, except that with 2.7 it also says the version isn't compatible. Then I resumed Taiji. It seemed to skip this step and produced some results, but it didn't output the GeneRank or any TF network file. What could be the problem?

RNA_Align(df9a..) Failed: Ran commands: STAR --readFilesIn TAIJIoligooutput//RNASeq/Download/SRR3070188.fastq.gz --genomeDir /mnt/genomes/Mus_musculus/UCSC_mm10/STARIndex/ --outFileNamePrefix ./STAR_align_tmp_dir.-a77e7c05d6e95a0c/ --runThreadN 4 --genomeLoad NoSharedMemory --outFilterType BySJout --outFilterMultimapNmax 20 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --outFilterMismatchNmax 999 --outFilterMismatchNoverReadLmax 0.04 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --outSAMunmapped Within --outSAMattributes NH HI AS NM MD --outSAMheaderCommentFile COfile.txt --outSAMheaderHD @HD VN:1.4 SO:coordinate --sjdbScore 1 --readFilesCommand zcat --outSAMtype BAM Unsorted --outSAMstrandField intronMotif --quantMode TranscriptomeSAM which STAR

Exception: error running: STAR --readFilesIn TAIJIoligooutput//RNASeq/Download/SRR3070188.fastq.gz --genomeDir /mnt/genomes/Mus_musculus/UCSC_mm10/STARIndex/ --outFileNamePrefix ./STAR_align_tmp_dir.-a77e7c05d6e95a0c/ --runThreadN 4 --genomeLoad NoSharedMemory --outFilterType BySJout --outFilterMultimapNmax 20 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --outFilterMismatchNmax 999 --outFilterMismatchNoverReadLmax 0.04 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --outSAMunmapped Within --outSAMattributes NH HI AS NM MD --outSAMheaderCommentFile COfile.txt --outSAMheaderHD @HD VN:1.4 SO:coordinate --sjdbScore 1 --readFilesCommand zcat --outSAMtype BAM Unsorted --outSAMstrandField intronMotif --quantMode TranscriptomeSAM exit status: -9 stderr: CallStack (from HasCallStack): error, called at src/Control/Workflow/Interpreter/Exec.hs:145:37 in SciFlow-0.7.2-Jc8TJcu7aUL61DWlZpDMFY:Control.Workflow.Interpreter.Exec

kaizhang commented 3 years ago

STAR needs a lot of memory (~50 GB for the human genome). The error is most likely caused by insufficient memory.
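
For anyone hitting the same error, here is a rough way to check this before rerunning. It is a minimal sketch and not part of Taiji or STAR. In Python's subprocess module a negative return code -N means the process was terminated by signal N. Assuming the "exit status: -9" in the log follows the same convention, STAR was killed by SIGKILL, which is what the Linux out-of-memory killer sends. The helper names below are hypothetical. The idea that the on-disk size of the STAR index approximates the RAM needed to load it, and the 1.2 headroom factor, are assumptions for illustration only.

```python
import os
import signal


def describe_exit_status(status: int) -> str:
    """Explain an exit status, treating negative values as 'killed by signal N'."""
    if status < 0:
        try:
            name = signal.Signals(-status).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {-status}"
        return f"killed by {name}"
    return "success" if status == 0 else f"exited with code {status}"


def parse_mem_available(meminfo_text: str) -> int:
    """Return MemAvailable in bytes, given the text of /proc/meminfo (Linux only)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024  # the kernel reports this value in kB
    raise ValueError("MemAvailable not found in meminfo text")


def directory_size(path: str) -> int:
    """Total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def index_fits_in_memory(index_bytes: int, available_bytes: int,
                         headroom: float = 1.2) -> bool:
    """Assumption: STAR needs roughly the index size times some headroom."""
    return available_bytes >= index_bytes * headroom


if __name__ == "__main__":
    print(describe_exit_status(-9))
    # Path taken from the --genomeDir argument in the log above.
    genome_dir = "/mnt/genomes/Mus_musculus/UCSC_mm10/STARIndex/"
    if os.path.isdir(genome_dir) and os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as fh:
            available = parse_mem_available(fh.read())
        needed = directory_size(genome_dir)
        print(f"index: {needed / 1e9:.1f} GB, available: {available / 1e9:.1f} GB")
        print("likely fits" if index_fits_in_memory(needed, available)
              else "likely too little memory")
```

If the check reports too little memory, the practical options are to run on a machine with more RAM or to request more memory from the job scheduler.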

maeleck commented 3 years ago

Finally fixed the issue. I am not sure whether I can ask a different question here, but I am wondering if Taiji can take ChIP-seq raw data as input instead of ATAC-seq. I assumed it can, since ChIP-seq and ATAC-seq raw data are both really just FASTQ files.

kaizhang commented 3 years ago

Yes, it can.