nf-core / isoseq

Genome annotation with PacBio Iso-Seq. Takes raw subreads as input, generates Full-Length Non-Chimeric (FLNC) sequences, and produces a BED annotation.
https://nf-co.re/isoseq
MIT License

Conversion from subreads.fastq.gz to subreads.bam #16

Open LEESojung1998 opened 1 year ago

LEESojung1998 commented 1 year ago

Description of feature

I downloaded a subreads.fastq.gz file from an SRR accession. I tried Picard, sam-dump, and reformat.sh to convert it to an unaligned.subreads.bam file so that I can run CCS. However, none of these tools seems to produce a subreads.bam in the PacBio Iso-Seq BAM format. Is there a recommended way to convert FASTQ to BAM so that the result is compatible with CCS?

sguizard commented 1 year ago

Hi, can you give me the SRR ID, please?

Kratos12138 commented 11 months ago

I have the same problem: the fastq.gz files cannot be converted into subreads.bam files, and I can't even tell whether the FASTQ files contain subreads or CCS reads. Can I get some help? Many thanks. The SRR IDs are SRR17180608 to SRR17180617.
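One rough way to tell subreads from CCS reads is to inspect the read names, if the SRA dump preserved them. The sketch below assumes the usual PacBio naming conventions (`<movie>/<zmw>/<qStart>_<qEnd>` for subreads, `<movie>/<zmw>/ccs` for CCS reads); these conventions are not stated in this thread, and `classify_header` and `summarize_fastq` are hypothetical helper names. If the original names were dropped during SRA submission, every read comes back as "unknown" and nothing can be concluded from the names alone.

```python
import gzip
import re
from collections import Counter
from itertools import islice

# Assumed PacBio read-name conventions (not confirmed in this thread):
#   subread: <movie>/<zmw>/<qStart>_<qEnd>
#   CCS:     <movie>/<zmw>/ccs
SUBREAD_RE = re.compile(r"(?:^|\s)\S+/\d+/\d+_\d+(?:\s|$)")
CCS_RE = re.compile(r"(?:^|\s)\S+/\d+/ccs(?:\s|$)")


def classify_header(header: str) -> str:
    """Classify one FASTQ header line as 'subread', 'ccs', or 'unknown'."""
    text = header.strip().lstrip("@")
    if CCS_RE.search(text):
        return "ccs"
    if SUBREAD_RE.search(text):
        return "subread"
    return "unknown"


def summarize_fastq(path, max_reads=1000):
    """Count read types among the first max_reads records of a FASTQ file.

    Assumes strict 4-line FASTQ records; handles plain or gzipped input.
    """
    counts = Counter()
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Header lines are lines 0, 4, 8, ... of a 4-line-per-record FASTQ.
        for header in islice(handle, 0, max_reads * 4, 4):
            counts[classify_header(header)] += 1
    return counts
```

Calling `summarize_fastq("SRR17180608.fastq.gz")` (file name chosen for illustration) returns a `Counter` whose dominant key suggests the read type. Read lengths can be a secondary hint, but names are the more direct signal when they survive.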

arslan9732 commented 7 months ago

Hi, any update on this issue? It seems that most Iso-Seq data in the SRA database is only available in FASTQ format. Is there any way to make ccs-compatible BAM files? I tried sam-dump from sratoolkit, but got an error:

| 20240423 11:06:24.352 | FATAL | ccs ERROR: [pbbam] dataset ERROR: only PacBio BAMs are supported for fetching chemistry info
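The error suggests that ccs looks for PacBio-specific metadata in the BAM header, which a generic FASTQ-to-BAM conversion does not write. As a minimal sketch, the check below scans header text (for example, the output of `samtools view -H file.bam`) for the keys that, as far as I understand the PacBio BAM format, are stored in the `DS` field of the `@RG` line. The key list and the helper name `missing_pacbio_keys` are assumptions, not taken from this thread, and passing this check does not guarantee that ccs will accept the file.

```python
# Keys assumed to appear in the @RG DS field of a PacBio BAM header
# (based on my reading of the PacBio BAM format; not confirmed in this thread).
REQUIRED_DS_KEYS = ("READTYPE", "BINDINGKIT", "SEQUENCINGKIT", "BASECALLERVERSION")


def missing_pacbio_keys(sam_header: str) -> list:
    """Return the assumed-required DS keys missing from the first @RG line.

    If the header has no @RG line at all, every key is reported as missing.
    """
    for line in sam_header.splitlines():
        if not line.startswith("@RG"):
            continue
        # Split the tab-separated TAG:VALUE fields of the @RG line.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        ds = fields.get("DS", "")
        # DS is a semicolon-separated list of KEY=VALUE items.
        present = {item.split("=", 1)[0] for item in ds.split(";") if "=" in item}
        return [key for key in REQUIRED_DS_KEYS if key not in present]
    return list(REQUIRED_DS_KEYS)
```

A header produced by a generic converter would typically report all four keys as missing, which is consistent with the "only PacBio BAMs are supported for fetching chemistry info" message above.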