faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

trinity/abyss assembly error: "The appear to be multiple files for R1/R2/Singleton reads" #178

Closed kkmills closed 3 years ago

kkmills commented 4 years ago

Hi Brant,

I am trying to do an assembly with abyss, and I got the error "The appear to be multiple files for R1/R2/Singleton reads". The folders with the clean UCEs look like they are set up correctly, but just in case, I re-ran illumiprocessor on the raw data and output everything to new folders. I still get the error, and I get the same error if I try assembling with trinity.

(phyluce) kendall@DESKTOP-NOSAIQH:~/marmot-uces$ phyluce_assembly_assemblo_abyss --config assembly-config2 --output ~/marmot-uces/assemblies --kmer 60 --cores 2 --clean --log-path log
2019-11-21 15:26:14,251 - phyluce_assembly_assemblo_abyss - INFO - ============ Starting phyluce_assembly_assemblo_abyss ===========
2019-11-21 15:26:14,253 - phyluce_assembly_assemblo_abyss - INFO - Version: git fatal: not a git repository: '/home/kendall/miniconda2/envs/phyluce/lib/python2.7/site-packages/.git'
2019-11-21 15:26:14,257 - phyluce_assembly_assemblo_abyss - INFO - Argument --abyss_se: False
2019-11-21 15:26:14,257 - phyluce_assembly_assemblo_abyss - INFO - Argument --clean: True
2019-11-21 15:26:14,259 - phyluce_assembly_assemblo_abyss - INFO - Argument --config: /home/kendall/marmot-uces/assembly-config2
2019-11-21 15:26:14,260 - phyluce_assembly_assemblo_abyss - INFO - Argument --cores: 2
2019-11-21 15:26:14,261 - phyluce_assembly_assemblo_abyss - INFO - Argument --dir: None
2019-11-21 15:26:14,263 - phyluce_assembly_assemblo_abyss - INFO - Argument --kmer: 60
2019-11-21 15:26:14,264 - phyluce_assembly_assemblo_abyss - INFO - Argument --log_path: /home/kendall/marmot-uces/log
2019-11-21 15:26:14,266 - phyluce_assembly_assemblo_abyss - INFO - Argument --output: /home/kendall/marmot-uces/assemblies
2019-11-21 15:26:14,268 - phyluce_assembly_assemblo_abyss - INFO - Argument --subfolder:
2019-11-21 15:26:14,268 - phyluce_assembly_assemblo_abyss - INFO - Argument --verbosity: INFO
2019-11-21 15:26:14,269 - phyluce_assembly_assemblo_abyss - INFO - Getting input filenames and creating output directories
2019-11-21 15:26:14,281 - phyluce_assembly_assemblo_abyss - INFO - -------------- Processing marmota_baibacina_OB23930 -------------
2019-11-21 15:26:14,282 - phyluce_assembly_assemblo_abyss - INFO - Finding fastq/fasta files
2019-11-21 15:26:14,296 - phyluce_assembly_assemblo_abyss - INFO - File type is fasta
Traceback (most recent call last):
  File "/home/kendall/miniconda2/envs/phyluce/bin/phyluce_assembly_assemblo_abyss", line 323, in <module>
    main()
  File "/home/kendall/miniconda2/envs/phyluce/bin/phyluce_assembly_assemblo_abyss", line 295, in main
    reads = get_input_files(dir, args.subfolder, log)
  File "/home/kendall/miniconda2/envs/phyluce/lib/python2.7/site-packages/phyluce/raw_reads.py", line 122, in get_input_files
    raise IOError("The appear to be multiple files for R1/R2/Singleton reads")
IOError: The appear to be multiple files for R1/R2/Singleton reads

My files look like this

(phyluce) kendall@DESKTOP-NOSAIQH:~/marmot-uces$ ls clean-uces2/marmota_baibacina_OB23930/
adapters.fasta  raw-reads  split-adapter-quality-trimmed  stats
(phyluce) kendall@DESKTOP-NOSAIQH:~/marmot-uces$ ls clean-uces2/marmota_baibacina_OB23930/split-adapter-quality-trimmed/
marmota_baibacina_OB23930-READ-singleton.fastq.gz  marmota_baibacina_OB23930-READ2.fastq.gz
marmota_baibacina_OB23930-READ1.fastq.gz

And the assembly log looks like this

2019-11-21 15:26:14,251 - phyluce_assembly_assemblo_abyss - INFO - ============ Starting phyluce_assembly_assemblo_abyss ===========
2019-11-21 15:26:14,253 - phyluce_assembly_assemblo_abyss - INFO - Version: git fatal: not a git repository: '/home/kendall/miniconda2/envs/phyluce/lib/python2.7/site-packages/.g$
2019-11-21 15:26:14,257 - phyluce_assembly_assemblo_abyss - INFO - Argument --abyss_se: False
2019-11-21 15:26:14,257 - phyluce_assembly_assemblo_abyss - INFO - Argument --clean: True
2019-11-21 15:26:14,259 - phyluce_assembly_assemblo_abyss - INFO - Argument --config: /home/kendall/marmot-uces/assembly-config2
2019-11-21 15:26:14,260 - phyluce_assembly_assemblo_abyss - INFO - Argument --cores: 2
2019-11-21 15:26:14,261 - phyluce_assembly_assemblo_abyss - INFO - Argument --dir: None
2019-11-21 15:26:14,263 - phyluce_assembly_assemblo_abyss - INFO - Argument --kmer: 60
2019-11-21 15:26:14,264 - phyluce_assembly_assemblo_abyss - INFO - Argument --log_path: /home/kendall/marmot-uces/log
2019-11-21 15:26:14,266 - phyluce_assembly_assemblo_abyss - INFO - Argument --output: /home/kendall/marmot-uces/assemblies
2019-11-21 15:26:14,268 - phyluce_assembly_assemblo_abyss - INFO - Argument --subfolder:
2019-11-21 15:26:14,268 - phyluce_assembly_assemblo_abyss - INFO - Argument --verbosity: INFO
2019-11-21 15:26:14,269 - phyluce_assembly_assemblo_abyss - INFO - Getting input filenames and creating output directories
2019-11-21 15:26:14,281 - phyluce_assembly_assemblo_abyss - INFO - -------------- Processing marmota_baibacina_OB23930 -------------
2019-11-21 15:26:14,282 - phyluce_assembly_assemblo_abyss - INFO - Finding fastq/fasta files
2019-11-21 15:26:14,296 - phyluce_assembly_assemblo_abyss - INFO - File type is fasta

Do you know what could be causing this?

brantfaircloth commented 4 years ago

can you send your config file, too? one option is to try assembling only a single individual’s worth of data... if that works then things are specified OK. my guess is that here, something is just a little off.

kkmills commented 4 years ago

I just tried assembling only one individual and got the same error.

Here is the config:

[samples]
marmota_baibacina_OB23930:~/marmot-uces/clean-uces/marmota_baibacina_OB23930
marmota_baibacina_OB25269:~/marmot-uces/clean-uces/marmota_baibacina_OB25269
marmota_baibacina_OB25270:~/marmot-uces/clean-uces/marmota_baibacina_OB25270
marmota_bobak_OB23908:~/marmot-uces/clean-uces/marmota_bobak_OB23908
marmota_bobak_OB24456:~/marmot-uces/clean-uces/marmota_bobak_OB24456
marmota_bobak_OB35663:~/marmot-uces/clean-uces/marmota_bobak_OB35663
marmota_camtschatica_OB23764:~/marmot-uces/clean-uces/marmota_camtschatica_OB23764
marmota_camtschatica_OB23901:~/marmot-uces/clean-uces/marmota_camtschatica_OB23901
marmota_camtschatica_OB23977:~/marmot-uces/clean-uces/marmota_camtschatica_OB23977
marmota_camtschatica_OB24676:~/marmot-uces/clean-uces/marmota_camtschatica_OB24676
marmota_caudata_UF26566:~/marmot-uces/clean-uces/marmota_caudata_UF26566
marmota_caudata_UF26567:~/marmot-uces/clean-uces/marmota_caudata_UF26567
marmota_caudata_UF26569:~/marmot-uces/clean-uces/marmota_caudata_UF26569
marmota_himalayana_OB26974:~/marmot-uces/clean-uces/marmota_himalayana_OB26974
marmota_himalayana_RSH4478:~/marmot-uces/clean-uces/marmota_himalayana_RSH4478
marmota_kastschenkoi_OB24436:~/marmot-uces/clean-uces/marmota_kastschenkoi_OB24436
marmota_kastschenkoi_OB24437:~/marmot-uces/clean-uces/marmota_kastschenkoi_OB24437
marmota_marmota_M00018:~/marmot-uces/clean-uces/marmota_marmota_M00018
marmota_marmota_OB24500:~/marmot-uces/clean-uces/marmota_marmota_OB24500
marmota_marmota_OB24504:~/marmot-uces/clean-uces/marmota_marmota_OB24504
marmota_menzbieri_OB23863:~/marmot-uces/clean-uces/marmota_menzbieri_OB23863
marmota_menzbieri_OB25305:~/marmot-uces/clean-uces/marmota_menzbieri_OB25305
marmota_monax_UAM92626:~/marmot-uces/clean-uces/marmota_monax_UAM92626
marmota_monax_UAM134826:~/marmot-uces/clean-uces/marmota_monax_UAM134826
marmota_olympus_UWBM79553:~/marmot-uces/clean-uces/marmota_olympus_UWBM79553
marmota_olympus_UWBM79849:~/marmot-uces/clean-uces/marmota_olympus_UWBM79849
marmota_sibirica_OB24817:~/marmot-uces/clean-uces/marmota_sibirica_OB24817
marmota_sibirica_OB24846:~/marmot-uces/clean-uces/marmota_sibirica_OB24846
marmota_sibirica_OB25705:~/marmot-uces/clean-uces/marmota_sibirica_OB25705
marmota_sibirica_OB_25801:~/marmot-uces/clean-uces/marmota_sibirica_OB25801

brantfaircloth commented 4 years ago

can you try a single file and use an absolute path to the data (rather than a relative path)?

brantfaircloth commented 4 years ago

oops - single individual
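
The thread does not say whether phyluce expands `~` or resolves relative paths in the config, so writing absolute paths removes that uncertainty. A minimal sketch of the conversion, assuming a hypothetical helper named `absolutize` (not part of phyluce):

```python
import os


def absolutize(path, base_dir=None):
    """Return an absolute, normalized version of a config path.

    Hypothetical helper for illustration; not part of phyluce.
    """
    # Expand a leading "~" to the current user's home directory.
    expanded = os.path.expanduser(path)
    # Resolve a relative path against base_dir (default: current directory).
    if not os.path.isabs(expanded):
        expanded = os.path.join(base_dir or os.getcwd(), expanded)
    return os.path.normpath(expanded)


if __name__ == "__main__":
    print(absolutize("~/marmot-uces/clean-uces/marmota_baibacina_OB23930"))
```

The printed path can then be pasted into the [samples] section in place of the `~`-prefixed one.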

kkmills commented 4 years ago

Ok, new config file is

[samples]
marmota_baibacina_OB23930:/home/kendall/marmot-uces/test-marmot/marmota_baibacina_OB23930

I got the same error:

(phyluce) kendall@DESKTOP-NOSAIQH:~/marmot-uces$ phyluce_assembly_assemblo_abyss --config assembly-config-test --output ~/marmot-uces/assemblies-test-marmot --kmer 60 --cores 2 --clean --log-path log
2019-11-21 16:34:30,562 - phyluce_assembly_assemblo_abyss - INFO - ============ Starting phyluce_assembly_assemblo_abyss ===========
2019-11-21 16:34:30,564 - phyluce_assembly_assemblo_abyss - INFO - Version: git fatal: not a git repository: '/home/kendall/miniconda2/envs/phyluce/lib/python2.7/site-packages/.git'
2019-11-21 16:34:30,564 - phyluce_assembly_assemblo_abyss - INFO - Argument --abyss_se: False
2019-11-21 16:34:30,565 - phyluce_assembly_assemblo_abyss - INFO - Argument --clean: True
2019-11-21 16:34:30,566 - phyluce_assembly_assemblo_abyss - INFO - Argument --config: /home/kendall/marmot-uces/assembly-config-test
2019-11-21 16:34:30,567 - phyluce_assembly_assemblo_abyss - INFO - Argument --cores: 2
2019-11-21 16:34:30,567 - phyluce_assembly_assemblo_abyss - INFO - Argument --dir: None
2019-11-21 16:34:30,568 - phyluce_assembly_assemblo_abyss - INFO - Argument --kmer: 60
2019-11-21 16:34:30,568 - phyluce_assembly_assemblo_abyss - INFO - Argument --log_path: /home/kendall/marmot-uces/log
2019-11-21 16:34:30,569 - phyluce_assembly_assemblo_abyss - INFO - Argument --output: /home/kendall/marmot-uces/assemblies-test-marmot
2019-11-21 16:34:30,570 - phyluce_assembly_assemblo_abyss - INFO - Argument --subfolder:
2019-11-21 16:34:30,571 - phyluce_assembly_assemblo_abyss - INFO - Argument --verbosity: INFO
2019-11-21 16:34:30,572 - phyluce_assembly_assemblo_abyss - INFO - Getting input filenames and creating output directories
2019-11-21 16:34:30,578 - phyluce_assembly_assemblo_abyss - INFO - -------------- Processing marmota_baibacina_OB23930 -------------
2019-11-21 16:34:30,579 - phyluce_assembly_assemblo_abyss - INFO - Finding fastq/fasta files
2019-11-21 16:34:30,582 - phyluce_assembly_assemblo_abyss - INFO - File type is fasta
Traceback (most recent call last):
  File "/home/kendall/miniconda2/envs/phyluce/bin/phyluce_assembly_assemblo_abyss", line 323, in <module>
    main()
  File "/home/kendall/miniconda2/envs/phyluce/bin/phyluce_assembly_assemblo_abyss", line 295, in main
    reads = get_input_files(dir, args.subfolder, log)
  File "/home/kendall/miniconda2/envs/phyluce/lib/python2.7/site-packages/phyluce/raw_reads.py", line 122, in get_input_files
    raise IOError("The appear to be multiple files for R1/R2/Singleton reads")
IOError: The appear to be multiple files for R1/R2/Singleton reads

I just noticed I am also getting a "git fatal: not a git repository" in the log. Could that be a problem too? I just downloaded phyluce for the first time last week, so I think everything should be up to date.

brantfaircloth commented 4 years ago

ok, final question - can you manually rename the directory for that test individual and the reads in it (or copy the whole thing, then rename)? basically, you want to remove the letters and numbers that follow the genus_species. i’m not sure if that is causing the problem. also, the git thing shouldn’t cause a problem. you can clear the error by installing git if you want.

kkmills commented 4 years ago

I removed the identifier following genus_species from the directory and every file in it (and updated the config file) -- still got the same error.

brantfaircloth commented 4 years ago

ok - can you package up reads for that individual and send those and the config files you are using to me? a dropbox link or similar would be great. i’ll see if i can figure out what’s up.

kkmills commented 4 years ago

Here is a Dropbox link. Thanks!

brantfaircloth commented 4 years ago

So, I'm not sure what's going on. First question: did you pass the --subfolder option when you ran phyluce_assembly_assemblo_abyss? It seems like you may not have included it (based on the log above, where "Argument --subfolder:" is empty).

You either need to include it as something like --subfolder split-adapter-quality-trimmed when you run phyluce_assembly_assemblo_abyss (telling the program that the data for each taxon are in this subfolder), or leave out the --subfolder option and include the split-adapter-quality-trimmed directory in each line of your config file, like so:

[samples]
marmota_baibacina_OB23930:/home/kendall/marmot-uces/test-marmot/marmota_baibacina_OB23930/split-adapter-quality-trimmed

When you do that, things seem to work on my end.
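
A quick way to check a config before running the assembler is to count the read files that sit directly inside each sample directory. The following is a minimal sketch, not phyluce's own file-matching logic: the function name `check_sample_dirs` is hypothetical, and the `-READ1.`, `-READ2.`, and `-READ-singleton.` filename patterns are assumptions based on the illumiprocessor output listed earlier in this thread.

```python
import configparser
import os

# Filename fragments assumed from the illumiprocessor listing in this thread.
PATTERNS = (
    ("READ1", "-READ1."),
    ("READ2", "-READ2."),
    ("singleton", "-READ-singleton."),
)


def check_sample_dirs(config_path):
    """Count read files found directly in each [samples] directory.

    Hypothetical helper for illustration; not part of phyluce.
    Returns {sample_name: {"READ1": n, "READ2": n, "singleton": n}}.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep sample names case-sensitive
    parser.read(config_path)
    report = {}
    for sample, path in parser.items("samples"):
        directory = os.path.expanduser(path)
        # A missing directory is reported as zero files rather than an error.
        names = os.listdir(directory) if os.path.isdir(directory) else []
        report[sample] = {
            label: sum(1 for name in names if fragment in name)
            for label, fragment in PATTERNS
        }
    return report


if __name__ == "__main__":
    import sys

    for sample, counts in check_sample_dirs(sys.argv[1]).items():
        print(sample, counts)
```

A sample whose counts are all zero most likely points at the parent directory instead of the directory that actually holds the reads.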

kkmills commented 4 years ago

Adding the subfolder flag didn't change anything, but updating the directories in the config file fixed it. Thank you!!
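
For a config with many samples, the same change can be applied to every line at once. This is a rough sketch under the assumption that each sample line has the form name:path, as in the configs posted above; `append_subfolder` is a hypothetical helper, not a phyluce command.

```python
def append_subfolder(config_text, subfolder="split-adapter-quality-trimmed"):
    """Append a subfolder to every name:path line of a [samples] config.

    Hypothetical helper for illustration; not part of phyluce.
    Lines that already end in the subfolder are left unchanged.
    """
    out = []
    for line in config_text.splitlines():
        stripped = line.strip()
        is_sample = (
            stripped
            and not stripped.startswith(("[", "#", ";"))
            and ":" in stripped
        )
        if is_sample:
            # Split on the first colon only: sample name, then path.
            name, path = stripped.split(":", 1)
            path = path.strip().rstrip("/")
            if not path.endswith("/" + subfolder):
                path = path + "/" + subfolder
            line = name + ":" + path
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = "[samples]\nmarmota_baibacina_OB23930:/home/kendall/marmot-uces/test-marmot/marmota_baibacina_OB23930\n"
    print(append_subfolder(example), end="")
```

Running the function twice gives the same result as running it once, so it is safe to apply to a config that has already been partly edited by hand.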