Closed. Biomedinformatics closed this issue 3 years ago.
Thank you for your report, user Biomedinformatics.
Why is it showing adaptors when I already trimmed them, and how can I get rid of the contaminated reads from the fastq files?
The expectation is that you either remove the contigs from your input FASTA files or edit the specified regions out.
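As a rough illustration of the first option (removing whole contigs), here is a minimal Python sketch. It is not part of PGAP; the function name `drop_contigs` and the toy record `NODE_1_length_8_cov_10.0` are made up for the example, and it assumes the contig ID is the first token of the FASTA header, without the `lcl|` prefix that appears in the report.

```python
def drop_contigs(fasta_text, ids_to_drop):
    """Return FASTA text without the records whose ID is in ids_to_drop.

    Hypothetical helper: assumes the record ID is the first
    whitespace-delimited token after ">" in each header line.
    """
    kept = []
    keep_current = True
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].split()
            record_id = header[0] if header else ""
            keep_current = record_id not in ids_to_drop
        if keep_current:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Toy input: the first record is invented, the second ID is taken from the report.
example = (
    ">NODE_1_length_8_cov_10.0\nACGTACGT\n"
    ">NODE_2102_length_234_cov_0.929936\nTTTT\n"
)
print(drop_contigs(example, {"NODE_2102_length_234_cov_0.929936"}), end="")
```

For a real assembly you would read the FASTA file into a string, pass the set of flagged contig IDs, and write the result to a new file that you then give to PGAP.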
If you would like to contest the choice of adaptor or contaminant sequences, feel free to examine the databases we provide as part of the reference package: a BLAST database and a FASTA file of adaptors and contaminants.
You should have in your directory something like an input* subdirectory. It should contain two things: the directory contam_in_prok_blastdb_dir with BLASTdb indexes, and the adaptor_fasta.fna file.
Please let me know if this helps.
@azat-badretdin Thank you for your reply. I have removed the contigs showing contamination and now PGAP is running (yet to complete). I just want to know if this Assembly can be submitted to NCBI, or do I need to go back to the fastq files and assemble again? Is removing only the contaminated contigs enough?
I just want to know if this Assembly can be submitted to NCBI
If it successfully completes, I do not see why not. Disclaimer: I know little about the SOPs of the submission unit at GenBank.
Azat is correct. If you have removed the contaminated spans or replaced them with Ns, you can submit the assembly FASTA and the .sqn file that PGAP produces to GenBank.
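For the "replace with Ns" option, a minimal Python sketch could look like the following. The helper name `mask_spans` is hypothetical, and it assumes the reported coordinates are 1-based and inclusive, which is a guess based on spans such as 204..253 on a contig of length 253.

```python
def mask_spans(sequence, spans):
    """Replace each (start, end) span with Ns.

    Assumption: coordinates are 1-based and inclusive.
    """
    bases = list(sequence)
    for start, end in spans:
        if not 1 <= start <= end <= len(bases):
            raise ValueError(f"span {start}..{end} is outside the sequence")
        for i in range(start - 1, end):
            bases[i] = "N"
    return "".join(bases)


# Toy sequence, not real data: mask positions 3 to 5.
print(mask_spans("ACGTACGTAC", [(3, 5)]))  # prints ACNNNCGTAC
```

Masking keeps the contig length and coordinates unchanged, while removing the span would shorten the contig; which one is preferable depends on where the flagged span sits.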
Thank you for all your support. PGAP has completed successfully.
You are very welcome! Thank you for reporting the issue!
I have paired-end fastq files for a bacterial genome. I trimmed them using trim_galore to get rid of adaptors and low-quality reads, then performed assembly using SPAdes. Finally, when I tried to annotate using PGAP, it gave an error, and the calls.tab file shows:
lcl|NODE_192_length_376_cov_0.943144 M 138..299 adaptor:multiple Adaptor
lcl|NODE_1390_length_253_cov_0.823864 M 204..253 adaptor:NGB01064.1 Adaptor
lcl|NODE_1563_length_248_cov_0.812865 M 88..121 adaptor:NGB00749.1 Adaptor
lcl|NODE_1590_length_247_cov_0.852941 M 216..247 adaptor:NGB00749.1 Adaptor
lcl|NODE_2102_length_234_cov_0.929936 X - adaptor:multiple Adaptor
Why is it showing adaptors when I already trimmed them, and how can I get rid of the contaminated reads from the fastq files?
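To turn rows like the ones above into something a script can act on, a small Python sketch such as this may help. The function name `parse_call` is hypothetical, and the column layout (sequence ID, one-letter code, span or "-", source, type) is inferred only from the rows pasted in this post; the meaning of the codes M and X is not interpreted here.

```python
def parse_call(line):
    """Split one calls.tab row into (contig_id, code, span, source).

    Assumptions: fields are whitespace-separated, the span is either "-"
    or "start..end", and a leading "lcl|" is not part of the contig name.
    """
    fields = line.split()
    seq_id, code, span, source = fields[0], fields[1], fields[2], fields[3]
    if seq_id.startswith("lcl|"):
        seq_id = seq_id[len("lcl|"):]
    if span == "-":
        coords = None  # no coordinates given for this row
    else:
        start, end = span.split("..")
        coords = (int(start), int(end))
    return seq_id, code, coords, source


row = "lcl|NODE_1390_length_253_cov_0.823864 M 204..253 adaptor:NGB01064.1 Adaptor"
print(parse_call(row))
```

Rows without coordinates come back with `None` in the span position, so a caller can tell them apart from rows that name a specific region.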