Closed brantfaircloth closed 3 years ago
Hi Dr. Faircloth,
Thank you for the super quick reply. I am running through the tutorial before jumping into my data.
I am assuming that the --output directories are the ones I specified for each run, illumiprocessor and phyluce_assembly_assemblo_trinity. Below is the ls -alh output for each.
illumiprocessor (directory clean_reads [labelled --output clean-fastq] in the tutorial)
ls -alh
total 24
drwxr-xr-x 7 jamestmcquillan staff 238B Dec 15 14:16 .
drwxr-xr-x 10 jamestmcquillan staff 340B Dec 15 14:38 ..
-rw-r--r--@ 1 jamestmcquillan staff 8.0K Dec 15 14:34 .DS_Store
drwxr-xr-x 7 jamestmcquillan staff 238B Dec 15 14:25 alligator_mississippiensis
drwxr-xr-x 6 jamestmcquillan staff 204B Dec 15 14:13 anolis_carolinensis
drwxr-xr-x 6 jamestmcquillan staff 204B Dec 15 14:13 gallus_gallus
drwxr-xr-x 6 jamestmcquillan staff 204B Dec 15 14:13 mus_musculus
phyluce_assembly_assemblo_trinity: (directory trinity_assemblies [labelled --output trinity_assemblies])
ls -alh
total 16
drwxr-xr-x 5 jamestmcquillan staff 170B Dec 15 15:10 .
drwxr-xr-x 10 jamestmcquillan staff 340B Dec 15 14:38 ..
-rw-r--r--@ 1 jamestmcquillan staff 6.0K Dec 15 15:10 .DS_Store
drwxr-xr-x 9 jamestmcquillan staff 306B Dec 15 14:38 alligator_mississippiensis_trinity
drwxr-xr-x 2 jamestmcquillan staff 68B Dec 15 14:38 contigs
Thank you for your time.
Best, James
hi james,
can I also see the ls -alh output from alligator_mississippiensis_trinity?
Sure here is that output.
ls -alh
total 40
drwxr-xr-x 9 jamestmcquillan staff 306B Dec 15 14:38 .
drwxr-xr-x 5 jamestmcquillan staff 170B Dec 15 15:10 ..
-rw-r--r-- 1 jamestmcquillan staff 559B Dec 15 14:38 Trinity.timing
-rw-r--r-- 1 jamestmcquillan staff 192B Dec 15 14:38 alligator_mississippiensis-READ1.fastq.readcount
-rw-r--r-- 1 jamestmcquillan staff 192B Dec 15 14:38 alligator_mississippiensis-READ2.fastq.readcount
drwxr-xr-x 2 jamestmcquillan staff 68B Dec 15 14:38 chrysalis
-rw-r--r-- 1 jamestmcquillan staff 0B Dec 15 14:38 left.fa
-rw-r--r-- 1 jamestmcquillan staff 0B Dec 15 14:38 right.fa
-rw-r--r-- 1 jamestmcquillan staff 4.6K Dec 15 14:38 trinity.log
ok, and now the output from trinity.log? It doesn't look like the assembly is completing (which is why you have no files to clean up when you run --clean).
Here is the trinity log file.
The assembly is not completing. It is erroring out, error below.
$ phyluce_assembly_assemblo_trinity \
    --conf assembly.conf \
    --output trinity_assemblies \
    --clean \
    --cores 8
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - =========== Starting phyluce_assembly_assemblo_trinity ==========
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Version: 1.5.0
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Argument --clean: True
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Argument --config: /Users/jamestmcquillan/Desktop/uce-Tutorial/assembly.conf
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Argument --cores: 8
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Argument --dir: None
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Argument --log_path: None
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Argument --min_kmer_coverage: 2
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Argument --output: /Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies
2015-12-15 14:38:53,236 - phyluce_assembly_assemblo_trinity - INFO - Argument --subfolder:
2015-12-15 14:38:53,237 - phyluce_assembly_assemblo_trinity - INFO - Argument --verbosity: INFO
2015-12-15 14:38:53,237 - phyluce_assembly_assemblo_trinity - INFO - Getting input filenames and creating output directories
2015-12-15 14:38:53,238 - phyluce_assembly_assemblo_trinity - INFO - ------------- Processing alligator_mississippiensis -------------
2015-12-15 14:38:53,239 - phyluce_assembly_assemblo_trinity - INFO - Finding fastq/fasta files
2015-12-15 14:38:53,240 - phyluce_assembly_assemblo_trinity - INFO - File type is fastq
2015-12-15 14:38:53,241 - phyluce_assembly_assemblo_trinity - INFO - Copying raw read data to /Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies/alligator_mississippiensis_trinity
2015-12-15 14:38:53,717 - phyluce_assembly_assemblo_trinity - INFO - Combining singleton reads with R1 data
2015-12-15 14:38:53,750 - phyluce_assembly_assemblo_trinity - INFO - Running Trinity.pl for PE data
2015-12-15 14:38:57,446 - phyluce_assembly_assemblo_trinity - INFO - Removing extraneous Trinity files
Traceback (most recent call last):
  File "/Users/jamestmcquillan/anaconda/bin/phyluce_assembly_assemblo_trinity", line 347, in <module>
    main()
  File "/Users/jamestmcquillan/anaconda/bin/phyluce_assembly_assemblo_trinity", line 326, in main
    cleanup_trinity_assembly_folder(output, log)
  File "/Users/jamestmcquillan/anaconda/bin/phyluce_assembly_assemblo_trinity", line 276, in cleanup_trinity_assembly_folder
    raise IOError("Neither Trinity.fasta nor trinity.log were found in output.")
IOError: Neither Trinity.fasta nor trinity.log were found in output.
Ok, it looks like the fastq files are not being converted to fasta files correctly (this explains why you have left.fa and right.fa files of 0 bytes in size). Are you sure that your assembly.conf file is set up correctly? For example, does the file exist at:
/Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.gz
Also, it looks like you are using OSX. Can you tell me which version and how much RAM you have?
-b
The assembly.conf exists at the base directory of all this, uce-Tutorial.
I am using OSX El Capitan ver. 10.11.1 (15B42) with 16 GB of 1600 MHz DDR3 RAM on this computer.
Best, James
I am not sure what's going on. Basically, the problem lies in fastool running against your data to convert the fastq files into fasta files (which is one of the first trinity steps). That fails, as you can see in your trinity.log, with:
Tuesday, December 15, 2015: 14:38:56 CMD: /Users/jamestmcquillan/anaconda/bin/trinity-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ2.fastq >> right.fa 2> /Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ2.fastq.readcount
bash: line 1: 5004 Trace/BPT trap: 5 /Users/jamestmcquillan/anaconda/bin/trinity-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq >> left.fa 2> /Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.readcount
Thread 1 terminated abnormally: Error, cmd: /Users/jamestmcquillan/anaconda/bin/trinity-2.0.6/trinity-plugins/fastool/fastool --illumina-trinity --to-fasta /Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq >> left.fa 2> /Users/jamestmcquillan/Desktop/uce-Tutorial/trinity_assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.readcount died with ret 34048 at /Users/jamestmcquillan/anaconda/bin/Trinity line 2116.
Use of uninitialized value in array dereference at /Users/jamestmcquillan/anaconda/bin/Trinity line 1211.
Because that fails, everything else fails, which explains your 0 size left.fa and right.fa files as well as the subsequent error messages.
I am not sure what is causing this problem, but RAM issues are one possibility. Running El Capitan is another (phyluce is tested to run on 10.10 but not 10.11, yet). Right now, this seems to be a trinity issue but could also be the result of running the code on OSX 10.11.
That said, check your fastq.gz files in the split-adapter-quality-trimmed folder, as well, to make sure they have content (e.g. they have some file size and a number of reads in them).
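One way to run that check, as a minimal sketch: the function name check_reads is made up for illustration, and it assumes the trimmed reads end in .fastq.gz somewhere below the directory you pass in. It reports the size in bytes and the number of reads (lines divided by 4) for each file.

```shell
# Hypothetical helper: report size and read count for every gzipped
# fastq file below a directory. A fastq record is 4 lines long, so
# reads = lines / 4.
check_reads() {
    dir="$1"
    find "$dir" -name "*.fastq.gz" | sort | while read -r f; do
        bytes=$(wc -c < "$f" | tr -d ' ')
        lines=$(gunzip -c "$f" | wc -l | tr -d ' ')
        echo "$f bytes=$bytes reads=$((lines / 4))"
    done
}

# Example, using the illumiprocessor --output name from the tutorial:
# check_reads clean-fastq
```

A file that shows bytes=0 or reads=0 would point at the trimming step rather than at Trinity.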
Thank you Dr. Faircloth,
I appreciate all of your time. I will play around more with what you said and get back to you if I find a solution on this Mac.
Cheers, James
Hello Brant and/or James,
We are having the exact same problem here when trying to assemble with trinity. I am wondering if a solution was ever found for this?
It is the same error, on OSX 10.11 using 32 GB RAM. The fastq files in the split-adapter-quality-trimmed folder have content, and assemblies worked with velvet.
Thanks, Shahan
If it's the same error in trinity.log, that may be a RAM issue. One solution is to switch versions of Trinity (e.g. get an older version and compile it yourself). I actually use an older version, and have updated my ~/.phyluce.conf to use that version:
[trinity]
trinity:$HOME/src/trinityrnaseq_r2013-02-25/Trinity.pl
kmer_coverage:2
jellyfish_memory:24G
I still don't get any problems when running the "new" version of trinity with the tutorial data... which is what makes me think the issue is with too little RAM on some computers (I have 48-64 GB on most of mine).
Thanks Brant. I tried it with a previous version of Trinity, and it seems to get past that first error. However, I still can't actually run the assembly because it seems trinity requires linux to compile. I probably should have checked that first! We will probably just have to assemble with trinity on our server, then bring the files back into the phyluce pipeline.
I'm using the version of Trinity that you are packaging in the new build in bioconda. I see similar problems and not much in the way of output in trinity-assemblies/alligator_mississippiensis_trinity.
I do note the following in trinity.log:
Trinity version: v2.1.1
** NOTE: Latest version of Trinity is Trinity-v2.6.6, and can be obtained at:
https://github.com/trinityrnaseq/trinityrnaseq/releases
which: no bowtie in (/usr/local/miniconda2/opt/trinity-2.1.1/trinity-plugins/BIN:/usr/local/miniconda2/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin)
which: no bowtie-build in (/usr/local/miniconda2/opt/trinity-2.1.1/trinity-plugins/BIN:/usr/local/miniconda2/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin)
Error, cannot find path to bowtie () or bowtie-build (), which is now needed as part of Chrysalis' readscaffolding step. If you should choose to not run bowtie, include the --no_bowtie in your Trinity command.
Try to conda install bowtie and rerun to see if that fixes the error. If so, I'll adjust the package for phyluce. Technically, the Trinity package should also include bowtie, and I'll suggest a change for that package, too.
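Before and after installing, it may help to confirm whether both programs are actually visible on the PATH, since that is what the "which: no bowtie in (...)" lines in the log are complaining about. A small sketch follows; the function name missing_tools is hypothetical.

```shell
# Hypothetical helper: print the name of each program given as an
# argument that cannot be found on the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example: prints nothing if both programs are available.
# missing_tools bowtie bowtie-build
```

Run it from the same environment (e.g. the same conda environment) that Trinity runs in, otherwise the PATH being checked may differ.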
It appears that bowtie certainly isn't being pulled in correctly by Trinity, so thanks for that.
With bowtie installed, things get a bit further, and now I get an error message which seems to indicate that processing stopped because of a mismatch between gunzip | wc -l and fastool. Please let me know if you think I should raise a separate issue for that. In the meantime I will double check and make sure I've been following all the instructions correctly.
No, just leave things here. The issue is really in the build of Trinity from bioconda (so a bioconda issue). We may be able to fix it by switching to a more recent version of Trinity than the one pulled by default. Can you send me the text of the last error and/or the log?
I get this in the Trinity log:
----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------
----------------------------------------------------------------------------------
Converting input files. (in parallel)
Wednesday, June 20, 2018: 14:46:40 CMD: gunzip -c /opt/tmp/test_phyluce/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.gz | fastool --illumina-trinity --to-fasta >> left.fa 2> /opt/tmp/test_phyluce/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.gz.readcount
Wednesday, June 20, 2018: 14:46:40 CMD: gunzip -c /opt/tmp/test_phyluce/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ2.fastq.gz | fastool --illumina-trinity --to-fasta >> right.fa 2> /opt/tmp/test_phyluce/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ2.fastq.gz.readcount
gzip: stdout: Broken pipe
Thread 1 terminated abnormally: Error, counts of reads in FQ: 1705959 (as per gunzip -c /opt/tmp/test_phyluce/uce-tutorial/trinity-assemblies/alligator_mississippiensis_trinity/alligator_mississippiensis-READ1.fastq.gz | wc -l) doesn't match fastool's report of FA records: 1573739 at /usr/local/miniconda2/bin/Trinity line 3060 thread 1.
main::ensure_complete_FQtoFA_conversion("gunzip -c /opt/tmp/test_phyluce/uce-tutorial/trinity-assembli"..., "/opt/tmp/test_phyluce/uce-tutorial/trinity-assemblies/alligat"...) called at /usr/local/miniconda2/bin/Trinity line 2099 thread 1
main::prep_seqs(ARRAY(0x1292500), "fq", "left", undef) called at /usr/local/miniconda2/bin/Trinity line 1310 thread 1
eval {...} called at /usr/local/miniconda2/bin/Trinity line 1310 thread 1
-conversion of 1573403 from FQ to FA format succeeded.
Trinity run failed. Must investigate error above.
I'm trying to work through the docs here:
http://phyluce.readthedocs.io/en/latest/tutorial-one.html so that is where I got the data from.
Something appears off with gzip/the config file and/or the reads going into the assembly - basically what's happening is that Trinity is dying because wc -l reports a different number of reads in the READ1 file from that reported by fastool. It looks like this might be occurring because gzip dies before getting all the way through the read file. I am not sure why that's happening (are you out of disk space? RAM?).
However, I'm also not seeing this error with fresh installs of phyluce in either centos 6 or centos 7 (and Trinity 2.1.1).
Thanks again for your prompt reply. I will investigate. My build environment is slightly odd in that I am trying to install Phyluce in a Singularity container for use on an HPC cluster. I'm actually trying to implement the tutorial as a kind of test script so that I can be fairly sure that everything is working reliably. I would not be at all surprised if this is the cause.
One thing to check might be to run the commands in the log independently of Trinity and compare the results. That's one of the things I checked: the output of gunzip -c file.fastq.gz | wc -l (divided by 4) versus the output from fastool for the same read files.
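A rough sketch of that comparison is below. It is an approximation rather than what Trinity does internally: instead of parsing fastool's report, it counts the ">" header lines in the converted fasta file (for example left.fa) and compares that with the fastq line count divided by 4. The function names are made up for illustration.

```shell
# Reads in a gzipped fastq file: line count divided by 4.
fastq_reads() {
    lines=$(gunzip -c "$1" | wc -l | tr -d ' ')
    echo $((lines / 4))
}

# Records in a fasta file: number of header lines starting with ">".
# grep -c exits non-zero when the count is 0, hence the "|| true".
fasta_records() {
    grep -c '^>' "$1" || true
}

# Compare the two counts; returns non-zero when they differ.
compare_counts() {
    fq=$(fastq_reads "$1")
    fa=$(fasta_records "$2")
    if [ "$fq" -eq "$fa" ]; then
        echo "match: $fq"
    else
        echo "mismatch: fastq=$fq fasta=$fa"
        return 1
    fi
}

# Example with placeholder file names:
# compare_counts file.fastq.gz left.fa
```

If the counts match when run by hand but Trinity still reports a mismatch, the problem is more likely in how Trinity runs the conversion than in the read files themselves.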
On another front, unittests for all the major scripts are coming as part of the Python 3 porting that I am doing - that should also help ensure more reliable (and consistent) operation.
As far as I can tell those agree with each other. I can't find any gz files under "trinity-assemblies" (I guess these are extracted / removed by the processing itself), but if I run fastool against the file in "split-adapter-quality-trimmed" I get: Sequences parsed: 1573403
gunzip -c alligator_mississippiensis-READ1.fastq.gz | wc -l yields 6293612 (4 * 1573403).
Don't worry too much about this though Dr. Faircloth - I'll track your progress on the re-write (at least until my user gets a bit more impatient!).
I am getting the same "Neither Trinity.fasta nor trinity.log were found in output." error as well with the tutorial data, so it is a RAM issue with the new Trinity?
possibly - have a look within the folder where data were being assembled (e.g. navigate into your output folder). There should be some files, and one of those should be output from the critter it was working on when a problem arose. That file may provide clues.
Hi Dr. Faircloth, my trinity.log error is Java:
Error, Trinity requires access to Java version 1.7. Currently installed version is: openjdk version "1.8.0_121"
OpenJDK Runtime Environment (Zulu 8.20.0.5-macosx) (build 1.8.0_121-b15)
OpenJDK 64-Bit Server VM (Zulu 8.20.0.5-macosx) (build 25.121-b15, mixed mode)
Should I downgrade Java?
Thanks
the correct java should be installed as part of the current phyluce package. you could also downgrade, or install and switch to the older version.
I installed the new version of phyluce but trinity is no longer supported on Mac OS. Will find ways to get around this, thank you for your help!
Hello Dr. Faircloth,
I am encountering the same problems as described in previous posts: both left.fa and right.fa files are empty. I am using a Mac with 32 GB of RAM (v. 10.13.6). The phyluce version I am using is 1.5.0 and the Trinity version is 2.0.6. This is the error that I am getting:
Traceback (most recent call last):
File "/Users/alejandrapanzera/miniconda2/bin/phyluce_assembly_assemblo_trinity", line 347, in <module>
which is the same as JETius.
wireless-10-104-12-225:WA01_Batx_trinity alejandrapanzera$ ls -alh
total 56
drwxr-xr-x 10 alejandrapanzera staff 320B Sep 6 14:39 .
drwxr-xr-x 5 alejandrapanzera staff 160B Sep 6 14:38 ..
-rw-r--r--@ 1 alejandrapanzera staff 8.0K Sep 6 14:39 .DS_Store
-rw-r--r--@ 1 alejandrapanzera staff 487B Sep 6 14:38 Trinity.timing
-rw-r--r-- 1 alejandrapanzera staff 195B Sep 6 14:38 WA01_Batx-READ1.fastq.readcount
-rw-r--r--@ 1 alejandrapanzera staff 195B Sep 6 14:38 WA01_Batx-READ2.fastq.readcount
drwxr-xr-x 2 alejandrapanzera staff 64B Sep 6 14:38 chrysalis
-rw-r--r-- 1 alejandrapanzera staff 0B Sep 6 14:38 left.fa
-rw-r--r-- 1 alejandrapanzera staff 0B Sep 6 14:38 right.fa
-rw-r--r--@ 1 alejandrapanzera staff 3.7K Sep 6 14:38 trinity.log
Does anybody have any idea how (or if) this can be fixed? I read on another thread something about the "_trinity" part of the output folder name not being recognized (in my case, "WA01_Batx_trinity"). Maybe that is one of the problems?
Thank you very much in advance.
Alejandra
For whatever reason, Trinity is not functioning correctly. The answer may be in the trinity.log file in the output you show above (you could be out of RAM). That said, Trinity is no longer supported on the Mac because it is so hard to deal with. I would suggest trying an alternate assembler on the mac (spades) or trying to get the assemblies to run on linux.
Hello Dr. Faircloth, and thank you very much for the rapid response.
I am attaching my trinity.log file. I would really appreciate it if you could take a look at it, as I don't recognize the errors, even when I look them up in the Trinity file.
If you believe there is no hope, I will follow your advice and use another assembler.
Thank you again!
Alejandra
fastool, which is one of the programs trinity uses, appears to be failing. this may be because you have too little RAM on your computer for the size of the files you are trying to assemble. it is hard to say. you can run the command individually:
fastool --illumina-trinity --to-fasta /Users/alejandrapanzera/Desktop/RapidGenomics_UCE/Trinity_assemblies/WA01_Batx_trinity/WA01_Batx-READ1.fastq >> left.fa 2> /Users/alejandrapanzera/Desktop/RapidGenomics_UCE/Trinity_assemblies/WA01_Batx_trinity/WA01_Batx-READ1.fastq.readcount
to see if it fails for you when running manually (it should). To determine whether the number of reads may be causing the problem, you could try to run the same command as above, but substitute a smaller file to see if it will run successfully. If the smaller file runs successfully, RAM is probably the issue.
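One way to make a smaller test file, sketched below, is to keep only the first N reads of the unzipped fastq. Each read occupies 4 lines, so the cut has to fall on a multiple of 4. The function name first_reads and the output file names are hypothetical.

```shell
# Hypothetical helper: write the first N reads (4 lines each) of an
# uncompressed fastq file to a new, smaller fastq file.
first_reads() {
    src="$1"
    n="$2"
    dest="$3"
    head -n $((n * 4)) "$src" > "$dest"
}

# Example with placeholder output names, followed by the same fastool
# call as in the log, pointed at the smaller file:
# first_reads WA01_Batx-READ1.fastq 10000 small-READ1.fastq
# fastool --illumina-trinity --to-fasta small-READ1.fastq >> small-left.fa
```

If fastool still fails on a file of only a few thousand reads, memory is unlikely to be the cause.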
Hello,
I tried with smaller files and get the exact same errors. I will be using another assembler I guess. Thank you for your time and help!
I encountered the same problem in Phyluce 1.6.7 with Trinity 2.1.1 when trying to run tutorial examples and for me this was not a memory issue.
The error is also referenced here: https://github.com/trinityrnaseq/trinityrnaseq/issues/139. It seemed to only be a problem with gzipped files, so someone proposed unzipping files first as a solution. This was apparently addressed by the Trinity team more permanently by moving away from fastool to seqtk in newer versions, which are not available with a conda build of phyluce.
What is bizarre is that the error occurs about 50% of the time for me, and I noticed that it does not happen when using a single core. The solution that worked for me was to just unzip all fastq files prior to running phyluce_assembly_assemblo_trinity. One can do this quickly by running a loop in the clean reads directory:
for d in $(find . -type d -name "*split*"); do gunzip "$d"/*fastq.gz; done
Trinity is about to be removed entirely from phyluce (in 1.7.0), so closing this.
Carrying over separate issue from #41.