faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

unmix fasta reads errors #247

Open michaelforthman opened 3 years ago

michaelforthman commented 3 years ago

Hi Brant,

I'm getting some errors when attempting to unmix fasta reads, with error messages varying between different phyluce versions.

1.7.1 and 1.7.0 error:

Traceback (most recent call last):
  File "/apps/phyluce/1.7.1/bin/phyluce_utilities_unmix_fasta_reads", line 155, in <module>
    main()
  File "/apps/phyluce/1.7.1/bin/phyluce_utilities_unmix_fasta_reads", line 116, in main
    temp_handle2, ">{}\n{}\n".format(record.description, record.seq)
TypeError: a bytes-like object is required, not 'str'

Earlier versions error:

Traceback (most recent call last):
  File "/apps/phyluce/20190308/bin/phyluce_utilities_unmix_fasta_reads", line 147, in <module>
    main()
  File "/apps/phyluce/20190308/bin/phyluce_utilities_unmix_fasta_reads", line 117, in main
    os.remove(temp_name1)
UnboundLocalError: local variable 'temp_name1' referenced before assignment

Below is an example of how my fasta files are formatted and my job code:

SRR2051471_67/1
CGGCAATTTTTATTTTCGCCTGTTTAACAAAAACATGTCCTATAGATTTAATTTTAGGTTTAATCTGCTCAGTGATAATTTATAATTAAATAGCCGCAGTATTTTGACTGTGCAAAGGGAGCATAATCATTTGTCTTTTAATTTAAGGCT
SRR2051471_168/1
TTCTCGTCTAACAAAAAAATTTTAGCATTTTAACTAAAAATTTAAATTCAAAATACTCAAGTAAAGAAAGTCTATTTTTCGTCCAATCATTCATACAAGCCTTCAATTAAAAGACAAATGATTATGCTACCTTTGCACAGTCACAATACT

phyluce_utilities_unmix_fasta_reads \
    --mixed-reads /blue/cwmiller/mforthman/coreoidea_phylogenomics_part2/genomic_transcriptomic_reads_for_mapping/quorum_output/Arocatus_melanocephalus_SRR2051471_mapped_reads_corrected.fasta \
    --out-r1 /blue/cwmiller/mforthman/coreoidea_phylogenomics_part2/genomic_transcriptomic_reads_for_mapping/quorum_output/Arocatus_melanocephalus_SRR2051471_mapped_corrected_R1.fasta \
    --out-r2 /blue/cwmiller/mforthman/coreoidea_phylogenomics_part2/genomic_transcriptomic_reads_for_mapping/quorum_output/Arocatus_melanocephalus_SRR2051471_mapped_corrected_R2.fasta \
    --out-r-singleton /blue/cwmiller/mforthman/coreoidea_phylogenomics_part2/genomic_transcriptomic_reads_for_mapping/quorum_output/Arocatus_melanocephalus_SRR2051471_mapped_corrected_singles.fasta

Is there an issue with my fasta file or code that I'm not seeing?

Thanks, Michael

michaelforthman commented 3 years ago

Also, I have ">" at the beginning of each fasta header. For some reason github keeps interpreting it as a quote and removes it.

brantfaircloth commented 3 years ago

Part of this is an issue with the move to Python 3. That said, I would use another tool to split these reads; I'm likely to just remove this code in an updated version. I'd look into seqtk (should come w/ phyluce) or bbmap, or you could use something like Galaxy (https://galaxyproject.org/support/ncbi-sra-fastq/#interleaved-forward-and-reverse-reads).
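
To illustrate the Python 3 side of this: the 1.7.x traceback is the error Python 3 raises when a `str` is handed to a call that only accepts bytes. The snippet below is a minimal sketch, not the phyluce code. It assumes the write goes through a low-level file descriptor with `os.write` (an assumption, since the script's source is not shown in this thread) and uses the first header from the example above with a made-up short sequence.

```python
import os
import tempfile

# Header taken from the example above; the sequence is a made-up placeholder.
header = "SRR2051471_67/1"
sequence = "ACGT"
text = ">{}\n{}\n".format(header, sequence)

# mkstemp returns an OS-level file descriptor plus the path to the file.
fd, path = tempfile.mkstemp(suffix=".fasta")
try:
    try:
        # Accepted by Python 2, rejected by Python 3 because text is a str.
        os.write(fd, text)
    except TypeError as err:
        print("TypeError:", err)
    # Python 3 expects bytes here, so encode the str before writing.
    os.write(fd, text.encode("utf-8"))
finally:
    os.close(fd)

# Read the file back to confirm only the encoded write landed.
with open(path) as handle:
    written = handle.read()
os.remove(path)
```

If the script does write this way, encoding the string (or opening the file in text mode and using its `write` method) would be the usual fix.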

For example, in seqtk, you would do something like:

seqtk seq -1 interleaved.fq > read_1.fq
seqtk seq -2 interleaved.fq > read_2.fq
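
If the mates are not strictly interleaved, or some reads have lost their partner, pairing by read name may be safer than splitting by position. The following is a rough pure-Python sketch of that idea, not the phyluce implementation. It assumes headers end in `/1` or `/2` as in the example above; `parse_fasta` and `unmix` are hypothetical helper names, and the `/2` record in the demo is invented for illustration.

```python
def parse_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def unmix(records):
    """Split records into (r1, r2, singletons) using the /1 and /2 suffix."""
    by_name = {}
    for header, seq in records:
        fields = header.split()
        name = fields[0] if fields else ""
        base, sep, mate = name.rpartition("/")
        if sep and base and mate in ("1", "2"):
            by_name.setdefault(base, {})[mate] = (header, seq)
        else:
            # No recognizable mate suffix: treat the read as unpaired.
            by_name.setdefault(name, {})["single"] = (header, seq)

    r1, r2, singletons = [], [], []
    for mates in by_name.values():
        if "1" in mates and "2" in mates:
            r1.append(mates["1"])
            r2.append(mates["2"])
        else:
            singletons.extend(mates.values())
    return r1, r2, singletons


if __name__ == "__main__":
    # Sequences shortened; the /2 record is invented for this demo.
    demo = [
        ">SRR2051471_67/1", "CGGCAATT",
        ">SRR2051471_67/2", "AATTGCCG",
        ">SRR2051471_168/1", "TTCTCGTC",
    ]
    r1, r2, singletons = unmix(parse_fasta(demo))
    for label, group in (("R1", r1), ("R2", r2), ("singletons", singletons)):
        print(label, [header for header, _ in group])
```

This keeps every read in memory, so it only suits modest inputs; for large read sets seqtk or bbmap is the better choice.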