Closed ShikanZ closed 4 years ago
Do you have the full log?
@ShikanZ I need the terminal log produced by taiji. If you did not record that, you can run the program with the same command again. It will not rerun the finished steps and it should reproduce any error message.
Those are the errors. I don't need the RSEM indices. Generating RSEM indices [ERROR][05-29 11:49] RNA_Make_Index(b29a..) Failed: Ran commands: mkdir -p /mnt/Data/Cui_lab/Shen/taiji_output/GENOME/RSEM_index rsem-prepare-reference --gtf /mnt/Data/references/GRCm38/Sequence/Annotation/Genes/genes.gtf /mnt/Data/references/GRCm38/Sequence/STARIndex/genome.fa /mnt/Data/Cui_lab/Shen/taiji_output/GENOME/RSEM_index/MM10 which rsem-prepare-reference Exception: user error (shelly did not find rsem-prepare-reference in the PATH: /home/shikan/miniconda3/bin:/home/shikan/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin) CallStack (from HasCallStack): error, called at src/Control/Workflow/Interpreter/Exec.hs:134:37 in SciFlow-0.7.2-Jc8TJcu7aUL61DWlZpDMFY:Control.Workflow.Interpreter.Exec
[ERROR][05-29 14:10] ATAC_Make_Expr_Table(0509..) Failed: Prelude.tail: empty list CallStack (from HasCallStack): error, called at src/Control/Workflow/Interpreter/Exec.hs:134:37 in SciFlow-0.7.2-Jc8TJcu7aUL61DWlZpDMFY:Control.Workflow.Interpreter.Exec
[ERROR][05-29 16:33] Program exits with errors
RSEM is needed for RNA quantification. If you have processed RNA-seq by yourself, you can input those quantifications directly.
I used salmon for RNA quantification. How do I input those files? Instead of inputing RNA fastq files, I will put quant files in the input.yml?
And also what about the second and third errors?
Yes, see here: https://taiji-pipeline.github.io/documentation/input.html#rna-seq.
If you are using TSV input file, then put:
RNA-seq id group_name 1 filepath GeneQuant
For the second error, go to the GeneQuant
directory and see if those files are empty.
GeneQuant has 4 files but they are empty
What annotation file are you using?
GTF file from Ensembl.
Please check if you have "gene" feature in the 3rd column.
No it doesn't have "gene" feature in the 3rd column
You need to use the full annotation with "gene", "transcript", etc. The one you used is probably for transcriptome analysis.
After updating the annotation file, run: taiji delete ATAC_Compute_TE ATAC_Gene_Count --delete-all
to update the cache and then run the program again.
I'm sorry. I think it has the gene feature
I didn't see any "gene" in the 3rd column. Also please note that ENSEMBL use "1,2,3..." to name chromosomes. Please make sure you use the same chromosome naming scheme during the rest of your analysis.
Example from the ensembl website:
1 transcribed_unprocessed_pseudogene gene 11869 14409 . + . gene_id "ENSG00000223972"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";
I have got this new error.
[ERROR][06-04 01:23] RNA_Make_Expr_Table(39aa..) Failed: readDouble: Fail to cast ByteString to Double:"Length" CallStack (from HasCallStack): error, called at src/Bio/Utils/Misc.hs:24:14 in bioinformatics-toolkit-0.9.3.1-CNRcUDPNpO01sIdCsENaFx:Bio.Utils.Misc CallStack (from HasCallStack): error, called at src/Control/Workflow/Interpreter/Exec.hs:134:37 in SciFlow-0.7.2-Jc8TJcu7aUL61DWlZpDMFY:Control.Workflow.Interpreter.Exec
Do not put anything additional to the expression file. Just the gene and expression value:
Gene1 <TAB> 12
Gene2 <TAB> 20
Gene3 <TAB> 25
And also I'm using fastq files to do the analysis. Here is the error. [ERROR][06-04 08:32] ATAC_QC(df4d..) Failed: Statistics.Matrix.fromRows: zero columns in matrix CallStack (from HasCallStack): error, called at src/Statistics/Matrix.hs:93:23 in dense-linear-algebra-0.1.0.0-K4G29RdkqcAAh5dSEMcmWb:Statistics.Matrix CallStack (from HasCallStack): error, called at src/Control/Workflow/Interpreter/Exec.hs:134:37 in SciFlow-0.7.2-Jc8TJcu7aUL61DWlZpDMFY:Control.Workflow.Interpreter.Exec
Please check the sizes of bed files and peak files to see if there is anything wrong.
They are 0 byte.
You can run samtools flagstat
on the bam files to see if there was something wrong during the alignment.
I looked at couple of them, They are pretty much look like this
79726438 + 0 in total (QC-passed reads + QC-failed reads) 0 + 0 secondary 0 + 0 supplementary 0 + 0 duplicates 67590366 + 0 mapped (84.78% : N/A) 79726438 + 0 paired in sequencing 39863219 + 0 read1 39863219 + 0 read2 0 + 0 properly paired (0.00% : N/A) 62821202 + 0 with itself and mate mapped 4769164 + 0 singletons (5.98% : N/A) 7078490 + 0 with mate mapped to a different chr 1240900 + 0 with mate mapped to a different chr (mapQ>=5)
Just to confirm: the files in the BED
directory are also empty? If so how about the files in the Bam
directory? Are there any close to empty file?
Bam files are normal, but Bed files are empty.
That's weird. I'm not sure what's going on. Maybe try rerunning the Bed conversion:
taiji delete ATAC_Bam_To_Bed
taiji run --config config.yml --select ATAC_Bam_To_Bed
See if the bed files can be produced.
It gives me this "Parts of the program finish successfully" right away.
What is the result of ls -lh
on the directory containing bam files.
It seems that all reads were filtered out. And I saw a problem from the flagstat output:
79726438 + 0 in total (QC-passed reads + QC-failed reads) 0 + 0 secondary 0 + 0 supplementary 0 + 0 duplicates 67590366 + 0 mapped (84.78% : N/A) 79726438 + 0 paired in sequencing 39863219 + 0 read1 39863219 + 0 read2 0 + 0 properly paired (0.00% : N/A) 62821202 + 0 with itself and mate mapped 4769164 + 0 singletons (5.98% : N/A) 7078490 + 0 with mate mapped to a different chr 1240900 + 0 with mate mapped to a different chr (mapQ>=5)
The reads are not properly paired. Could you post the input file?
RNA-seq:
id: CD8_TCM group: TCM replicates:
id: CD8_TEM group: TEM replicates:
id: CD8_TRM group: TRM replicates:
ATAC-seq:
id: ATAC_CD8_TCM group: TCM replicates:
id: ATAC_CD8_TEM group: TEM replicates:
id: ATAC_CD8_TRM group: TRM replicates:
The input looks good. I'm not sure why the reads are not properly paired. The alignment was created by bwa mem -M -k 32
.
I also create another input.yml file. Can you check if there is anything incorrect?
RNA-seq:
id: CD8_TCM group: TCM replicates:
id: CD8_TEM group: TEM replicates:
id: CD8_TRM group: TRM replicates:
ATAC-seq:
Looks good to me!
But make sure the reads in bam files are properly paired!
The pipeline finished successfully, but I found that in the RNAseq(input fastq files) folder there are two samples are missing. I have 3 groups, each group having 3 replicates. However, one group only has the rep 3 in the folder, rep 1 and rep2 are missing.
I think I found out what happened. I've made a mistake in the input.yml.
Hi I am recently runing Taiji pipeline, intergrated analysis of RNAseq and ATACseq. However, it gives me this error message "Program exits with errors". I have no idea what the errors are. Right now the output file has ATAC folder and RNA folder. RNA folder is empty, while ATAC folder has Bam, Bed, BigWig, GeneQuant, Peaks, QC, and TFBS.
Thanks,
Shikan