juliema / aTRAM

BSD 3-Clause "New" or "Revised" License

Abyss assembler issue: Missing files in temp directory #308

Open rafelafrance opened 2 years ago

rafelafrance commented 2 years ago

@zgliwen

Let's move your issue over here, please.

=====================================================================================================

@rafelafrance Hi,

I had the same issue. The difference from the others is that I used Abyss rather than Trinity as the assembler, and it reported a similar error: ERROR: Exception: [Errno 2] No such file or directory: '/data/results/atram_out_220407/kaM_49223_imd/atram_qfz_iu_n/kaM_49223_seq.fa_01_xdlazpp6/output.fasta-unitigs.fa'

My command was: $ atram.py --database atram_out_220407/kaM_49223 --query basic_info/kaM_seq.fa --assembler abyss -t /data/results/atram_out_220407/kaM_49223_imd --keep-temp-dir --no-filter --path miniconda3/envs/py3/bin -o /data/results/atram_out_220407/kaM_49223

However, using the fasta files that aTRAM created in the temporary directory, I can get the assembly by running abyss-pe myself. The command was: $ abyss-pe np=8 name=/data/results/atram_out_220407/kaM_49223_abyss k=64 in='atram_out_220407/kaM_49223_imd/atram_qfz_iu_n/kaM_49223_seq.fa_01_xdlazpp6/paired_1.fasta atram_out_220407/kaM_49223_imd/atram_qfz_iu_n/kaM_49223_seq.fa_01_xdlazpp6/paired_2.fasta'

The software versions I used were: aTRAM 2.4.3, abyss 2.3.4-h41cdee2_1

Could you give me some suggestions, please? Thanks for your help.

zgliwen commented 2 years ago

Actually, I used Abyss as the assembler, not Trinity ...

rafelafrance commented 2 years ago

Noted. Thank you.

rafelafrance commented 2 years ago

@juliema I'm having trouble replicating this issue here. Have you run into this problem when working on the "Pronghorn" server?

emilyostrow commented 2 years ago

I ran into this problem when I forgot to activate my conda environment where Abyss was installed. Activating the environment fixed it. I think the program can't find abyss.
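One quick way to test this hypothesis is to ask Python where it would find the assembler from the same shell that launches aTRAM. This is only a sketch: `locate_tool` is a hypothetical helper, not part of aTRAM, and it assumes the ABySS driver is called `abyss-pe`, as in the commands earlier in this thread.

```python
import os
import shutil


def locate_tool(name, extra_dir=None):
    """Return the full path of an executable, or None if it is not found.

    extra_dir is an optional directory searched before the regular PATH
    (an assumption for illustration; it is not how aTRAM is known to work).
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    found = locate_tool("abyss-pe")
    if found:
        print("abyss-pe found at", found)
    else:
        print("abyss-pe is not on PATH; is the conda environment active?")
```

If this prints the "not on PATH" message in the shell where atram.py is started, the assembler most likely cannot be launched from there either.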

zgliwen commented 2 years ago

@emilyostrow Thanks. I actually ran aTRAM in the conda environment that includes Abyss, but it failed. I think you are right that aTRAM could not find Abyss in my environment, but I don't know why. For now, I can take the intermediate files from aTRAM and assemble the sequences with Abyss myself.

juliema commented 2 years ago

You could put the path to abyss in the command. I think it is --path. That way it will find it.

zgliwen commented 2 years ago

@juliema Thanks, Juliema. I tried it, but it failed again. The commands I used and the whole log file are shown below. I hope this information is useful. Thank you.

The old command: $ atram.py --database atram_out_220407/kaM_49223 --query basic_info/kaM_seq.fa --assembler abyss -t /data/results/atram_out_220407/kaM_49223_imd --keep-temp-dir --no-filter --path miniconda3/envs/py3/bin -o /data/results/atram_out_220407/kaM_49223

The new command: $ /data/software/aTRAM/atram.py --database results/atram_out_220407/kaM_49223 --query /data/basic_info/kaM_seq.fa --assembler abyss -t results/atram_out_220407/kaM_49223_imd --keep-temp-dir --abyss-kmer 30 --no-filter --path /data/miniconda3/envs/py3/bin/abyss-pe -o results/atram_out_220407/kaM_49223_test > results/atram_out_220407/kaM_49223_test.log 2>&1

$ less results/atram_out_220407/kaM_49223_test.log
( 617211) 2022-06-24 19:47:12.072941 INFO : ################################################################################
( 617211) 2022-06-24 19:47:12.073074 INFO : aTRAM version: v2.4.3
( 617211) 2022-06-24 19:47:12.073095 INFO : Python version: 3.9.2 (default, Mar 3 2021, 20:02:32) [GCC 7.3.0]
( 617211) 2022-06-24 19:47:12.073110 INFO : /data/software/aTRAM/atram.py --database results/atram_out_220407/kaM_49223 --query /data/basic_info/kaM_seq.fa --assembler abyss -t results/atram_out_220407/kaM_49223_imd --keep-temp-dir --abyss-kmer 30 --no-filter --path /data/miniconda3/envs/py3/bin/abyss-pe -o results/atram_out_220407/kaM_49223_test
( 617211) 2022-06-24 19:47:12.073195 INFO : aTRAM blast DB = " atram_out_220407/kaM_49223", query = "kaM_seq.fa", iteration 1
( 617211) 2022-06-24 19:47:12.073490 INFO : Blasting query against shards: iteration 1
( 617211) 2022-06-24 19:47:24.697196 INFO : All 8 blast results completed
( 617211) 2022-06-24 19:47:24.697776 INFO : 1028 blast hits in iteration 1
( 617211) 2022-06-24 19:47:24.697814 INFO : Writing assembler input files: iteration 1
( 617211) 2022-06-24 19:47:31.853828 INFO : Assembling shards with abyss: iteration 1
( 617211) 2022-06-24 19:47:32.022082 ERROR: Exception: [Errno 2] No such file or directory: '/data/results/atram_out_220407/kaM_49223_imd/atram_ke04m6v2/kaM_49223_seq.fa_01_qyk7lwxp/output.fasta-unitigs.fa'
( 617211) 2022-06-24 19:47:32.022342 INFO : 0 total contigs after iteration 1

juliema commented 2 years ago

OK, two things. First, I would remove abyss-pe from the path if that is just the name of the program. If it is a folder name, then leave it.
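To illustrate the difference between pointing at the executable and pointing at its folder, here is a small sketch. `tool_directory` is a hypothetical helper written for this thread, not an aTRAM function, and whether aTRAM itself tolerates a file in --path is not confirmed here.

```python
import os


def tool_directory(path):
    """Return a directory suitable for use in a search path.

    If path points at a file (for example .../bin/abyss-pe), return its
    parent directory; if it is already a directory, return it unchanged.
    """
    path = os.path.abspath(os.path.expanduser(path))
    return os.path.dirname(path) if os.path.isfile(path) else path
```

With the command from this thread, `tool_directory("/data/miniconda3/envs/py3/bin/abyss-pe")` would return `/data/miniconda3/envs/py3/bin` on a machine where that file exists, which is the value to try for --path.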

Can you see if this file exists? /data/results/atram_out_220407/kaM_49223_imd/atram_ke04m6v2/kaM_49223_seq.fa_01_qyk7lwxp/output.fasta-unitigs.fa If it does, then the program is not finding it. If it does not exist, which is probably the case, then there are two potential issues: 1. the assembler is not working, or 2. no contigs are being assembled.

What locus are you assembling, and do you have an idea of the coverage?

Also, just checking: do you have the right version of abyss, and does it work when you run it independently? Could you try running the abyss-pe command separately to make sure?
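The file check above can be scripted if the temp directory was kept with --keep-temp-dir. This is a rough sketch under stated assumptions: `inspect_assembly_dir` is a hypothetical helper, and the default file name is simply the one from the error message in this thread.

```python
import os


def inspect_assembly_dir(directory, expected="output.fasta-unitigs.fa"):
    """Report whether the expected assembler output exists.

    Returns (found, sizes), where sizes maps each regular file in the
    directory to its size in bytes.
    """
    sizes = {}
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full):
            sizes[name] = os.path.getsize(full)
    return expected in sizes, sizes
```

The sizes may help tell the two cases apart: empty input fasta files would suggest there was nothing to assemble, while non-empty inputs with no output file would point more towards the assembler not running at all.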

happy to jump on zoom if you need to walk through it.

zgliwen commented 2 years ago

@juliema Hi, Juliema.

There is no “output.fasta-unitigs.fa” file in the directory /data/results/atram_out_220407/kaM_49223_imd/atram_ke04m6v2/kaM_49223_seq.fa_01_qyk7lwxp, but there are several JSON files and five FASTA files.

I can successfully assemble the paired reads into a complete contig, and the command I used is: $ abyss-pe np=8 name=/data/results/atram_out_220407/kaM_49223_abyss k=64 in='results/atram_out_220407/kaM_49223_imd/atram_ke04m6v2/kaM_49223_seq.fa_01_qyk7lwxp/paired_1.fasta results/atram_out_220407/kaM_49223_imd/atram_ke04m6v2/kaM_49223_seq.fa_01_qyk7lwxp/paired_2.fasta'

I found that when setting the output file name in abyss, the path must be absolute. Maybe you noticed this when writing aTRAM?

Thanks.

juliema commented 2 years ago

It is pretty common to need absolute paths when running programs that call other programs. Generally it is safest to always use absolute paths. Did this solve the problem for you?
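The reason relative paths are fragile is that a child process may run from a different working directory, so the same relative path can resolve to a different place. A minimal sketch of resolving paths up front; `absolutize` is a hypothetical helper, not something aTRAM provides:

```python
import os


def absolutize(paths, base=None):
    """Resolve each path against base (default: current working directory).

    Paths that are already absolute are returned normalized but otherwise
    unchanged.
    """
    base = base or os.getcwd()
    return [
        os.path.normpath(os.path.join(base, os.path.expanduser(p)))
        for p in paths
    ]
```

For example, with `base="/data"`, the relative `results/atram_out_220407/kaM_49223_imd` from the command above becomes `/data/results/atram_out_220407/kaM_49223_imd`.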

zgliwen commented 2 years ago

@juliema No, it does not work. Even after I changed all the paths to absolute ones, it still said it could not find output.fasta-unitigs.fa. I am quite sure now that the reason is that aTRAM cannot find Abyss on my server, but I cannot guess why. Maybe it is because I installed Abyss using bioconda?

Could you please show me your command and log file when running aTRAM? Perhaps I could guess some other reasons. Thanks.

Erythroxylum commented 1 year ago

Hello @juliema, thanks for a great program! I have some new WGS data I want to assemble, as I did in 2019 and 2021, but atram.py is no longer working. I am getting the same "[Errno 2] No such file or directory: '/tmp/atram_*" error. I have reinstalled all programs via conda as well as independently from source, and used "--path" to designate these executable folders. I have also created a tmp folder, pointed aTRAM to it, and used --keep-temp-dir. There are no files being written to the /tmp/ directories. I have run trinity, velvet, abyss, and spades with no success. Please help!

atram.out.txt

juliema commented 1 year ago

Hi Dawson,

I just looked at your atram.out file and I did not see the --keep-temp-dir option in the atram commands. Could you double check this?

/home/FM/dwhite/miniconda3/envs/aTRAM/bin/atram.py --blast-db /home/FM/dwhite/data_active/erythroxylum/atram_db_bb/bana_1447 --query-split /home/FM/dwhite/hybd3/atram-wgs2022/g287_maj_consensus.fasta --assembler spades --output-prefix /home/FM/dwhite/hybd3/atram-wgs2022/bana_1447 --path /home/FM/dwhite/programs/spades/assembler/

Erythroxylum commented 1 year ago

Hi @juliema, Thanks for your attention! Sorry that output didn't have that flag and others. I attached a new output file plus the tmp folder (has other iterations with other assemblers also). The json files do show blast hits with the names of reads but none of the fasta folders have content, and the spades folder is empty (or alternatively the trinity.* file isn't written).

Could this be a BLAST issue? It is using the aTRAM blast: blastn: blast 2.12.0, build Jul 13 2021 09:03:00. I also tried blast 2.14.0 with the same results, by using the path flag.

Or a SPAdes issue? SPAdes v3.13.1

tmp.tar.gz nohup.out.txt

juliema commented 1 year ago

I don't think it is a BLAST issue, in the sense of something being wrong with BLAST itself. It looks like it is finding only a few reads, and is therefore not able to assemble anything. If you look at the output, it says it only found 2 hits. That means it is not going to be able to assemble anything and will not create the temp files.

What are you trying to assemble? How closely related to your target taxon is your reference sequence? Are you using DNA or protein as the query?
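To check hit counts across many runs without reading each log by hand, something like the following could help. It is a sketch that assumes the log lines look like the "1028 blast hits in iteration 1" line posted earlier in this thread; `blast_hits_per_iteration` is a hypothetical helper, not part of aTRAM.

```python
import re

# Matches log fragments such as "1028 blast hits in iteration 1".
HITS_RE = re.compile(r"(\d+) blast hits in iteration (\d+)")


def blast_hits_per_iteration(log_text):
    """Map iteration number -> number of blast hits reported in the log."""
    return {
        int(iteration): int(hits)
        for hits, iteration in HITS_RE.findall(log_text)
    }
```

Passing in the contents of a log file returns a small dictionary, so runs with only a handful of hits are easy to spot.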

Erythroxylum commented 1 year ago

Makes sense that 2 hits won't assemble anything, sorry for sending a poor example. However, I reran all my samples, and some have hundreds to thousands (20k, 105k) of blast hits, and the same error appears. The majority are in the 40-80 range. These are low-coverage WGS samples, and the targets are custom nuclear exon DNA sequences, so phylogenetic distance is not an issue.

blasthits.txt atram.out.txt

genomic-tech commented 5 months ago

Hi guys, I'm also facing the same issue. Is there any way to solve this?

My command: python3 atram.py --max-processes=12 --blast-db=/dgxb_home/se23plsc009/Thrips_Project/results/aTRAM_db/T1.sqlite.db --query-split=/dgxb_home/se23plsc009/Thrips_Project/data/ref_orthologs/2395_query_file --output-prefix=/dgxb_home/se23plsc009/Thrips_Project/results/Temporary/Output.fasta-unitigs.fa --assembler=abyss --path=/dgxb_home/se23plsc009/softwares/aTRAM/lib/assemblers/velvet.py --iterations=3 --max-target-seqs=3000 --log-file=/dgxb_home/se23plsc009/Thrips_Project/results/aTRAM_assembly1/log_file --temp-dir=/dgxb_home/se23plsc009/Thrips_Project/results/aTRAM_assembly1/Temporary/ --keep-temp-dir --protein

Error: Exception: [Errno 2] No such file or directory: '/dgxb_home/se23plsc009/Thrips_Project/results/aTRAM_assembly/Temporary/atram_nowsbxk1/T1_2395_query_file_RPRC015459_PA_2395.fasta_01_clkdfnle/output.fasta-unitigs.fa'

HiranyaSudasinghe commented 3 months ago

Hello,

I had the same issue when I installed and used the latest version of abyss through conda: conda install -c bioconda -c conda-forge abyss

This file: output.fasta-unitigs.fa does not exist in the tmp dir and I get the error: Exception: [Errno 2] No such file or directory: '/path/to/output.fasta-unitigs.fa'

However, then I installed abyss v 2.0.2 as recommended in aTRAM: conda create -n atram-abyss abyss=2.0.2=h51208dd_5 -c bioconda

and added the path to abyss in the command. Now the abyss assembly is working as expected with aTRAM!

Path to abyss: export PATH="/path/to/miniconda3/envs/atram-abyss/bin/:$PATH"

Path to aTRAM: export PATH="/path/to/bin/aTRAM/:$PATH"

aTRAM version used: v2.4.4

aTRAM command used: atram.py -b ${SPECIES} -Q ${LIBRARY} -a abyss -o ${SPECIES}-local-assembly --cpus 20 --temp-dir ${TMPDIR}
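Since the ABySS version seems to matter here, it may be worth comparing the installed version against the one that worked. The sketch below only parses version strings in the forms quoted in this thread (for example `abyss 2.3.4-h41cdee2_1` or `abyss=2.0.2=h51208dd_5`); `parse_version` and `matches_known_good` are hypothetical helpers, and treating 2.0.2 as the reference is based solely on the report above.

```python
import re

# The ABySS version reported to work with aTRAM in this thread.
KNOWN_GOOD = (2, 0, 2)


def parse_version(text):
    """Extract the first X.Y.Z version number from a string as a tuple."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no X.Y.Z version found in: " + repr(text))
    return tuple(int(part) for part in match.groups())


def matches_known_good(text):
    """True if the version in text equals the version that worked here."""
    return parse_version(text) == KNOWN_GOOD
```

The input string could come from the output of `conda list abyss` in the environment used to run aTRAM.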