isugifNF / polishCLR

A nextflow pipeline for polishing CLR assemblies
https://isugifnf.github.io/polishCLR/

Require PacBio CLR data to be bam files #28

Open j23414 opened 2 years ago

j23414 commented 2 years ago

If PacBio data is passed in as a fasta file, the @RG annotation is lost, and Arrow (gcpp) will complain:

gcpp ERROR: [pbbam] read group ERROR: basecaller version is too short

Maybe we can check for a .fasta or .fa extension and print a warning. Or hack an acceptable @RG annotation...

j23414 commented 1 year ago

Could add guard-rails here to throw an error if the user passes in a fasta file instead of the required bam file:

https://github.com/isugifNF/polishCLR/blob/d1a1b9f8b08132a6f5715f774c0ddb915983bdce/main.nf#L227-L236
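
A rough sketch of what such a guard-rail could look like is below. It is written in Python only to show the logic; the real check would need to be ported to Groovy inside main.nf. The function name `check_clr_input` and the list of rejected suffixes are hypothetical and not taken from the pipeline.

```python
from pathlib import Path

# Hypothetical list of FASTA-style suffixes to reject (an assumption,
# not something defined by polishCLR itself).
FASTA_SUFFIXES = (".fasta", ".fa", ".fasta.gz", ".fa.gz")


def check_clr_input(path: str) -> None:
    """Raise ValueError unless `path` looks like a BAM file.

    Only the file name is inspected; the file is never opened, so this
    does not confirm that the BAM header really carries an @RG line.
    """
    name = Path(path).name.lower()
    if name.endswith(FASTA_SUFFIXES):
        raise ValueError(
            f"PacBio CLR reads must be a BAM file, got FASTA: {path} "
            "(the @RG annotation is lost in FASTA and gcpp will fail)"
        )
    if not name.endswith(".bam"):
        raise ValueError(f"PacBio CLR reads must be a BAM file, got: {path}")


if __name__ == "__main__":
    check_clr_input("reads.subreads.bam")  # passes silently
    try:
        check_clr_input("reads.fasta")
    except ValueError as err:
        print(f"ERROR: {err}")
```

Failing early on the file name is cheaper than letting the run reach the Arrow step and stop on the pbbam read group error. A stricter version could also inspect the BAM header for an @RG line.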