Closed by SchwarzEM 4 years ago
And, after I posted this, I realized naming the script "falcon_name_fastq.pl" rather than "...fasta.pl" would have made considerably more sense! Please feel free to correct my naming. It doesn't affect the script's operation; I've used it and it worked (at least in my hands).
Seems pretty useful. When we have more time next week, we'll copy that into a user-contributions directory. Thanks!
Great!
When you get a chance, please run the following one-liner (or whatever Pythonic etc. equivalent you would prefer) on the file that I submitted:
cat falcon_name_fasta.pl.txt | perl -ne ' s/falcon_name_fasta/falcon_name_fastq/g; print; ' > falcon_name_fastq.pl ;
That should give you a Perl script with a Perl-ish file suffix, in which my typo of 'fasta' has been corrected both in the filename and in the script itself.
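For anyone who would rather use the Pythonic equivalent mentioned above, here is a minimal sketch. The function name `rename_script` and its default arguments are made up for illustration; it is meant to do the same thing as the Perl one-liner (replace every `falcon_name_fasta` with `falcon_name_fastq` and write the result to a new file), and it also reports how many replacements were made.

```python
from pathlib import Path


def rename_script(src: str, dst: str,
                  old: str = "falcon_name_fasta",
                  new: str = "falcon_name_fastq") -> int:
    """Copy src to dst, replacing every occurrence of `old` with `new`.

    Returns the number of replacements, so an unexpected zero is easy to spot.
    (Hypothetical helper, not part of the submitted script.)
    """
    text = Path(src).read_text()
    count = text.count(old)  # non-overlapping occurrences, like s///g
    Path(dst).write_text(text.replace(old, new))
    return count
```

Calling `rename_script("falcon_name_fasta.pl.txt", "falcon_name_fastq.pl")` in the directory that holds the attachment should give the same result as the one-liner.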
Thank you for your script, @SchwarzEM. It works well on FASTA; I have not tried it on FASTQ.
Wow! Good to see that the script is still useful, 4.5 years later...
I previously encountered a problem (described in #249) where I could not assemble reads that had been error-corrected with PBcR-MHAP because they had had their names changed, and FALCON could not accept the altered names (and thus gave me the complaint, "Line 1: Pacbio header line format error"). With useful advice from Jason Chin, I devised a Perl script that will take an arbitrary FastQ reads file and rename its reads so that they are FALCON-usable, while leaving the sequence of the reads unchanged. It will also generate a name-conversion table, so that (if needed) identities of reads can be tracked.
I'm posting the script here, so that others don't have to devise the same script in order to overcome this problem.
Usage of the script is as described in its default help message:
The script itself is as follows:
Attached version is here:
falcon_name_fasta.pl.txt
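The attached script is Perl. Purely as an illustration of the approach described above (rename every read, leave the sequence untouched, keep a name-conversion table), here is a minimal Python sketch. It is not a port of the attachment. The function name `rename_fastq`, the placeholder movie name `falcon_read`, and the exact `movie/well/start_end` naming scheme are my assumptions about what a PacBio-style header looks like, so check them against what your FALCON version accepts. It also assumes strict four-line FASTQ records (no wrapped sequence lines).

```python
from typing import Iterable, List, Tuple


def rename_fastq(lines: Iterable[str],
                 movie: str = "falcon_read") -> Tuple[List[str], List[Tuple[str, str]]]:
    """Rename four-line FASTQ records to PacBio-style names.

    Returns (renamed FASTQ lines, [(new_name, old_name), ...]).
    The naming scheme movie/well/0_length is an assumption for illustration.
    """
    records = [line.rstrip("\n") for line in lines]
    while records and records[-1] == "":  # tolerate trailing blank lines
        records.pop()
    if len(records) % 4 != 0:
        raise ValueError("expected exactly 4 lines per FASTQ record")

    renamed: List[str] = []
    table: List[Tuple[str, str]] = []
    for i in range(0, len(records), 4):
        header, seq, plus, qual = records[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")
        old_name = header[1:]            # original name, kept for the table
        well = i // 4 + 1                # running counter stands in for the well number
        new_name = f"{movie}/{well}/0_{len(seq)}"
        # Sequence and quality strings are passed through unchanged.
        renamed.extend([f"@{new_name}", seq, "+", qual])
        table.append((new_name, old_name))
    return renamed, table
```

Writing `table` out as two tab-separated columns gives a conversion file that lets you trace any renamed read back to its original name.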