faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

Trinity not working: IOError: Neither Trinity.fasta nor trinity.log were found in output. #169

Closed claudiavaga closed 4 years ago

claudiavaga commented 4 years ago

Hi! I am having problems running Trinity v2.1.1 in both Phyluce 1.6.2 and 1.6.5 with the tutorial files. The error I am getting is this:

Traceback (most recent call last):
  File "/home/claudiavaga/miniconda2/envs/phyluce/bin/phyluce_assembly_assemblo_trinity", line 362, in <module>
    main()
  File "/home/claudiavaga/miniconda2/envs/phyluce/bin/phyluce_assembly_assemblo_trinity", line 341, in main
    cleanup_trinity_assembly_folder(output, log)
  File "/home/claudiavaga/miniconda2/envs/phyluce/bin/phyluce_assembly_assemblo_trinity", line 283, in cleanup_trinity_assembly_folder
    raise IOError("Neither Trinity.fasta nor trinity.log were found in output.")
IOError: Neither Trinity.fasta nor trinity.log were found in output.

and in the trinity.log file I find this:

Trinity version: v2.1.1
** NOTE: Latest version of Trinity is Trinity-v2.8.5, and can be obtained at: https://github.com/trinityrnaseq/trinityrnaseq/releases

Monday, September 30, 2019: 16:44:02 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /home/claudiavaga/miniconda2/envs/phyluce/opt/trinity-2.1.1/util/support_scripts/ExitTester.jar 0
Monday, September 30, 2019: 16:44:03 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /home/claudiavaga/miniconda2/envs/phyluce/opt/trinity-2.1.1/util/support_scripts/ExitTester.jar 1
Monday, September 30, 2019: 16:44:03 CMD: mkdir -p /home/claudiavaga/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/chrysalis

-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------

Converting input files. (in parallel)
Monday, September 30, 2019: 16:44:03 CMD: gunzip -c /home/claudiavaga/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.gz | fastool --illumina-trinity --to-fasta >> left.fa 2> /home/claudiavaga/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.gz.readcount
Monday, September 30, 2019: 16:44:03 CMD: gunzip -c /home/claudiavaga/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ2.fastq.gz | fastool --illumina-trinity --to-fasta >> right.fa 2> /home/claudiavaga/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ2.fastq.gz.readcount

gzip: stdout: Broken pipe
Thread 1 terminated abnormally: Error, counts of reads in FQ: 1705959 (as per gunzip -c /home/claudiavaga/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.gz | wc -l) doesn't match fastool's report of FA records: 1655072 at /home/claudiavaga/miniconda2/envs/phyluce/bin/Trinity line 3060 thread 1.
    main::ensure_complete_FQtoFA_conversion("gunzip -c /home/claudiavaga/uce-tutorial/trinity-assemblies/a"..., "/home/claudiavaga/uce-tutorial/trinity-assemblies/alligator_m"...) called at /home/claudiavaga/miniconda2/envs/phyluce/bin/Trinity line 2099 thread 1
    main::prep_seqs(ARRAY(0x7ffff17cae10), "fq", "left", undef) called at /home/claudiavaga/miniconda2/envs/phyluce/bin/Trinity line 1310 thread 1
    eval {...} called at /home/claudiavaga/miniconda2/envs/phyluce/bin/Trinity line 1310 thread 1
-conversion of 1573403 from FQ to FA format succeeded.
Trinity run failed. Must investigate error above.

I tried both Phyluce 1.6.2 and 1.6.5 and I get the same error with each. Does anyone know how to solve it? Thank you!
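The Trinity log above reports a mismatch between the number of reads in the FASTQ file and the number of FASTA records that fastool produced. As an independent sanity check on the inputs, here is a minimal sketch (not part of phyluce or Trinity) that counts records in the gzipped READ1 and READ2 files and compares them. The function names are hypothetical, and the sketch assumes plain four-line FASTQ records.

```python
import gzip


def count_fastq_records(path):
    """Count records in a gzipped FASTQ file, assuming 4 lines per record."""
    n_lines = 0
    with gzip.open(path, "rt") as handle:
        for _ in handle:
            n_lines += 1
    if n_lines % 4 != 0:
        # A line count that is not a multiple of 4 usually means a
        # truncated or malformed file.
        raise ValueError(
            "%s: %d lines is not a multiple of 4" % (path, n_lines)
        )
    return n_lines // 4


def check_pair(read1_path, read2_path):
    """Return (n_read1, n_read2, counts_match) for a pair of FASTQ files."""
    n1 = count_fastq_records(read1_path)
    n2 = count_fastq_records(read2_path)
    return n1, n2, n1 == n2
```

If both files pass this check and hold the same number of records, the inputs are at least structurally consistent, which would point toward the conversion step rather than the read files themselves.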

brantfaircloth commented 4 years ago

This is a problem that has started to arise in Trinity. Try one of the other assemblers, like spades.

NataliaCD commented 3 years ago

Hello Dr. Faircloth, I am having the same issue with Trinity, and I am wondering whether there is a solution available now, or I just need to use another assembler or try something else to make Trinity work. Here is the error I am getting:

(phyluce) ncortes@vortex:~/Novogene_Data$ phyluce_assembly_assemblo_trinity --conf assembly_Afrater.conf --output trinity-assemblies --clean --cores 12
2020-12-08 18:47:25,418 - phyluce_assembly_assemblo_trinity - INFO - =========== Starting phyluce_assembly_assemblo_trinity ==========
2020-12-08 18:47:25,418 - phyluce_assembly_assemblo_trinity - INFO - Version: git 185b705
2020-12-08 18:47:25,418 - phyluce_assembly_assemblo_trinity - INFO - Argument --clean: True
2020-12-08 18:47:25,418 - phyluce_assembly_assemblo_trinity - INFO - Argument --config: /home/ncortes/Novogene_Data/assembly_Afrater.conf
2020-12-08 18:47:25,418 - phyluce_assembly_assemblo_trinity - INFO - Argument --cores: 12
2020-12-08 18:47:25,419 - phyluce_assembly_assemblo_trinity - INFO - Argument --dir: None
2020-12-08 18:47:25,419 - phyluce_assembly_assemblo_trinity - INFO - Argument --log_path: None
2020-12-08 18:47:25,419 - phyluce_assembly_assemblo_trinity - INFO - Argument --min_kmer_coverage: 2
2020-12-08 18:47:25,419 - phyluce_assembly_assemblo_trinity - INFO - Argument --output: /home/ncortes/Novogene_Data/trinity-assemblies
2020-12-08 18:47:25,419 - phyluce_assembly_assemblo_trinity - INFO - Argument --subfolder:
2020-12-08 18:47:25,419 - phyluce_assembly_assemblo_trinity - INFO - Argument --verbosity: INFO
2020-12-08 18:47:25,419 - phyluce_assembly_assemblo_trinity - INFO - Getting input filenames and creating output directories
2020-12-08 18:47:25,422 - phyluce_assembly_assemblo_trinity - INFO - ------------------ Processing MZ168894_Afrater ------------------
2020-12-08 18:47:25,422 - phyluce_assembly_assemblo_trinity - INFO - Finding fastq/fasta files
2020-12-08 18:47:25,424 - phyluce_assembly_assemblo_trinity - INFO - File type is fastq
2020-12-08 18:47:25,426 - phyluce_assembly_assemblo_trinity - INFO - Copying raw read data to /home/ncortes/Novogene_Data/trinity-assemblies/MZ168894_Afrater_trinity
2020-12-08 18:47:25,877 - phyluce_assembly_assemblo_trinity - INFO - Combining singleton reads with R1 data
2020-12-08 18:47:25,902 - phyluce_assembly_assemblo_trinity - INFO - Running Trinity.pl for PE data
2020-12-08 18:47:41,009 - phyluce_assembly_assemblo_trinity - WARNING - Did not clean all fastq/fasta files from /home/ncortes/Novogene_Data/trinity-assemblies/MZ168894_Afrater_trinity
2020-12-08 18:47:41,009 - phyluce_assembly_assemblo_trinity - INFO - Removing extraneous Trinity files
Traceback (most recent call last):
  File "/home/ncortes/anaconda/envs/phyluce/bin/phyluce_assembly_assemblo_trinity", line 362, in <module>
    main()
  File "/home/ncortes/anaconda/envs/phyluce/bin/phyluce_assembly_assemblo_trinity", line 341, in main
    cleanup_trinity_assembly_folder(output, log)
  File "/home/ncortes/anaconda/envs/phyluce/bin/phyluce_assembly_assemblo_trinity", line 283, in cleanup_trinity_assembly_folder
    raise IOError("Neither Trinity.fasta nor trinity.log were found in output.")
IOError: Neither Trinity.fasta nor trinity.log were found in output.

Many thanks.

brantfaircloth commented 3 years ago

I would try spades.

NataliaCD commented 3 years ago

Okay good. Thank you!

NataliaCD commented 3 years ago

Hi Dr. Faircloth, I am wondering if the config file for spades could be the same one used for trinity. I was using the same one, but was getting this error:

Traceback (most recent call last):
  File "/home/ncortes/anaconda/envs/phyluce/bin/phyluce_assembly_assemblo_spades", line 221, in <module>
    main()
  File "/home/ncortes/anaconda/envs/phyluce/bin/phyluce_assembly_assemblo_spades", line 168, in main
    input = get_input_data(args.config, args.dir)
  File "/home/ncortes/github/phyluce/phyluce/raw_reads.py", line 135, in get_input_data
    groups = conf.items('samples')
  File "/home/ncortes/anaconda/envs/phyluce/lib/python2.7/ConfigParser.py", line 642, in items
    raise NoSectionError(section)
ConfigParser.NoSectionError: No section: 'samples'

So I don't know if it's because I need to adjust the config file in another way for spades, or this is just a different problem. Thank you.

brantfaircloth commented 3 years ago

The two files should be interchangeable... But without knowing what either looks like, it's hard to diagnose the issue. It seems as if assemblo_spades is either unable to find your config file, or that the file is incorrectly formatted in some way.
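Both possibilities can produce the same `NoSectionError`, because Python's `ConfigParser.read()` silently skips files it cannot open, so a wrong path only surfaces later when the `samples` section is requested. The sketch below separates the two cases before running the assembler. It is a standalone illustration written for Python 3's `configparser`, not phyluce's own code, and `read_samples` is a hypothetical helper name.

```python
import configparser
import os


def read_samples(conf_path):
    """Return {sample_name: read_directory} from a [samples] section.

    Raises a specific error for a missing file versus a missing section,
    instead of the generic NoSectionError.
    """
    # Resolve the path against the current working directory, so the
    # message shows exactly which file was looked for.
    full_path = os.path.abspath(os.path.expanduser(conf_path))
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep sample names case-sensitive
    # read() returns the list of files it managed to parse; an empty
    # list means the file was not found or not readable.
    if not parser.read(full_path):
        raise FileNotFoundError("config file not found: %s" % full_path)
    if not parser.has_section("samples"):
        raise ValueError("no [samples] section in %s" % full_path)
    return dict(parser.items("samples"))
```

Calling `read_samples("assembly_Afrater.conf")` from the wrong working directory would then report the full path that was tried, which makes a misplaced file easy to spot.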

NataliaCD commented 3 years ago

This is what the config looks like:

[samples]
MZ168894_Afrater:/home/ncortes/Novogene_Data/clean_fastq/MZ168894_Afrater/split-adapter-quality-trimmed/
MZ168895_Afrater:/home/ncortes/Novogene_Data/clean_fastq/MZ168895_Afrater/split-adapter-quality-trimmed/

brantfaircloth commented 3 years ago

That looks ok. What is in those directories and how are you calling assemblo_spades?

NataliaCD commented 3 years ago

This is how I am calling it:

(phyluce) ncortes@vortex:~$ phyluce_assembly_assemblo_spades \
    --conf assembly_Afrater.conf \
    --output spades-assemblies \
    --cores 12

And this is an example of the content in one directory:

(phyluce) ncortes@vortex:~$ ls /home/ncortes/Novogene_Data/clean_fastq/MZ168894_Afrater/split-adapter-quality-trimmed/
MZ168894_Afrater-READ1.fastq.gz  MZ168894_Afrater-READ2.fastq.gz  MZ168894_Afrater-READ-singleton.fastq.gz

Thank you.

brantfaircloth commented 3 years ago

Earlier, it seems like assembly_Afrater.conf is in ~/Novogene_Data/assembly_Afrater.conf, but here, you are treating it as if it's in your $HOME directory ~/assembly_Afrater.conf.

NataliaCD commented 3 years ago

Sorry, my bad. I had closed the terminal and forgot to change to the correct directory... However, I am having another issue. I am getting a warning for some of the samples, but not all of them (WARNING - Did not clean all fastq/fasta files from ~/xxxx_Afrater_spades). Could this be related to running out of memory? I have already modified the spades section in phyluce.conf.

(phyluce) ncortes@vortex:~/Novogene_Data$ phyluce_assembly_assemblo_spades --conf assembly_Afrater.conf --output spades-assemblies --cores 12
2020-12-10 15:12:06,430 - phyluce_assembly_assemblo_spades - INFO - =========== Starting phyluce_assembly_assemblo_spades ===========
2020-12-10 15:12:06,430 - phyluce_assembly_assemblo_spades - INFO - Version: git 185b705
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Argument --config: /home/ncortes/Novogene_Data/assembly_Afrater.conf
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Argument --cores: 12
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Argument --dir: None
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Argument --do_not_clean: False
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Argument --log_path: None
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Argument --output: /home/ncortes/Novogene_Data/spades-assemblies
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Argument --subfolder:
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Argument --verbosity: INFO
2020-12-10 15:12:06,431 - phyluce_assembly_assemblo_spades - INFO - Getting input filenames and creating output directories
2020-12-10 15:12:06,434 - phyluce_assembly_assemblo_spades - INFO - ------------------ Processing MZ168894_Afrater ------------------
2020-12-10 15:12:06,434 - phyluce_assembly_assemblo_spades - INFO - Finding fastq/fasta files
2020-12-10 15:12:06,436 - phyluce_assembly_assemblo_spades - INFO - File type is fastq
2020-12-10 15:12:06,437 - phyluce_assembly_assemblo_spades - INFO - Running SPAdes for PE data
2020-12-10 15:14:35,243 - phyluce_assembly_assemblo_spades - WARNING - Did not clean all fastq/fasta files from /home/ncortes/Novogene_Data/spades-assemblies/MZ168894_Afrater_spades
2020-12-10 15:14:35,244 - phyluce_assembly_assemblo_spades - INFO - Symlinking assembled contigs into /home/ncortes/Novogene_Data/spades-assemblies/contigs
2020-12-10 15:14:35,244 - phyluce_assembly_assemblo_spades - INFO - ------------------ Processing MZ168895_Afrater ------------------
2020-12-10 15:14:35,244 - phyluce_assembly_assemblo_spades - INFO - Finding fastq/fasta files
2020-12-10 15:14:35,245 - phyluce_assembly_assemblo_spades - INFO - File type is fastq
2020-12-10 15:14:35,245 - phyluce_assembly_assemblo_spades - INFO - Running SPAdes for PE data
2020-12-10 15:37:39,908 - phyluce_assembly_assemblo_spades - WARNING - Did not clean all fastq/fasta files from /home/ncortes/Novogene_Data/spades-assemblies/MZ168895_Afrater_spades
2020-12-10 15:37:39,909 - phyluce_assembly_assemblo_spades - INFO - Symlinking assembled contigs into /home/ncortes/Novogene_Data/spades-assemblies/contigs
2020-12-10 15:37:39,909 - phyluce_assembly_assemblo_spades - INFO - ------------------ Processing MZ168896_Afrater ------------------
2020-12-10 15:37:39,909 - phyluce_assembly_assemblo_spades - INFO - Finding fastq/fasta files
2020-12-10 15:37:39,910 - phyluce_assembly_assemblo_spades - INFO - File type is fastq
2020-12-10 15:37:39,910 - phyluce_assembly_assemblo_spades - INFO - Running SPAdes for PE data
2020-12-10 16:07:10,183 - phyluce_assembly_assemblo_spades - INFO - Symlinking assembled contigs into /home/ncortes/Novogene_Data/spades-assemblies/contigs

Thank you.

brantfaircloth commented 3 years ago

Yes, this warning suggests that assembly failed for some samples. If you look in the spades.log for each sample that failed, you may find more information. Usually, the cause is running out of RAM, so you could try to find a machine with more RAM, or you could downsample your reads before assembling.
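With many samples, it can help to list which assembly folders triggered the warning, so you know which spades.log files to open. A minimal sketch follows, assuming the phyluce output has been saved as text in the same format as the log lines posted in this thread. The helper name is hypothetical, and the pattern will not match paths that contain spaces.

```python
import re

# Matches the warning line as it appears in the log output in this thread.
WARNING_RE = re.compile(
    r" - WARNING - Did not clean all fastq/fasta files from (?P<path>\S+)"
)


def folders_with_cleanup_warning(log_text):
    """Return the assembly folder names that triggered the cleanup warning."""
    folders = []
    for match in WARNING_RE.finditer(log_text):
        # Keep only the last path component, e.g. "MZ168894_Afrater_spades".
        folders.append(match.group("path").rstrip("/").rsplit("/", 1)[-1])
    return folders
```

For the run shown above, this would pick out the MZ168894 and MZ168895 folders and skip MZ168896, which finished without the warning.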

One example of how to downsample can be found here: http://protocols.faircloth-lab.org/en/latest/protocols-computer/snippets/random-computer-snippets.html#subsample-reads-for-r1-and-r2-using-seqtk
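The linked protocol uses seqtk. If seqtk is not available, the same idea can be sketched in plain Python: make one random keep-or-drop decision per read pair, so that READ1 and READ2 stay in sync. This is an illustrative alternative rather than the method from the protocol. The function names and the default seed are assumptions, and it expects gzipped four-line FASTQ files with mates in the same order.

```python
import gzip
import random


def read_fastq(handle):
    """Yield FASTQ records as lists of 4 lines from an open text handle."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            return
        yield record


def subsample_pairs(r1_in, r2_in, r1_out, r2_out, fraction, seed=42):
    """Keep each read pair with probability `fraction`; return pairs kept."""
    rng = random.Random(seed)  # fixed seed makes the subsample repeatable
    kept = 0
    with gzip.open(r1_in, "rt") as in1, gzip.open(r2_in, "rt") as in2, \
            gzip.open(r1_out, "wt") as out1, gzip.open(r2_out, "wt") as out2:
        # zip() stops at the shorter file, so inputs should hold the same
        # number of records.
        for rec1, rec2 in zip(read_fastq(in1), read_fastq(in2)):
            if rng.random() < fraction:
                out1.writelines(rec1)
                out2.writelines(rec2)
                kept += 1
    return kept
```

A call such as `subsample_pairs(r1, r2, r1_sub, r2_sub, fraction=0.5)` keeps roughly half of the pairs. Pointing the `[samples]` entry at a directory that holds the subsampled files would then let the assembler run on the smaller input.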

NataliaCD commented 3 years ago

Okay, I will check the log to see what is the best I can do. Thank you very much!