bahlolab / PLASTER

Nextflow pipeline for long amplicon typing of PacBio SMRT sequencing data
MIT License

Error when running with own configuration #17

Closed: cyriltata closed this issue 2 years ago

cyriltata commented 2 years ago

After successfully running with the test configuration, I tried running pre-processing with my own config and BAM files, and I get this fatal error:

 FATAL | ccs ERROR: Missing base features: IPD or PulseWidth

Any ideas why this is?

jemunro commented 2 years ago

Sounds like an issue with the input subreads bam file. See https://github.com/PacificBiosciences/pbbioconda/issues/432#issuecomment-913427865.

cyriltata commented 2 years ago

@jemunro I have bam files that have not been demultiplexed. What can I do to generate bam reads with pw and ip tags?

jemunro commented 2 years ago

The pipeline is designed for Sequel subreads output. Can you run: samtools view <input.bam> | cut -f1,12- | head -1 and share the output to confirm the format of the PacBio reads?

Are you able to get the raw subreads output file and try that? (i.e. the file output from the Sequel run).

cyriltata commented 2 years ago

Running samtools gives this output:

/bam > samtools view demultiplex.M13_bc1001_F--M13_bc1049_R.bam | cut -f1,12- | head -1
m64187e_220214_111831/461574/ccs        bx:B:i,37,36    ec:f:55 np:i:55 rq:f:1  sn:B:f,9.69447,14.7193,3.3817,6.14301   we:i:9056538    ws:i:339056     zm:i:461574     qs:i:37 qe:i:114        bc:B:S,0,32     bq:i:93 cx:i:12 bl:Z:GGTAGACACGTGTGCTCTCTCCGGAAACAGCTATGAC   bt:Z:ACTGGCCGTCGTTTTACCGCACTCTGATATGTGCTA       ql:Z:~~~~~~~~~~~~~~~~~~~~~~U~~~~~~~~~~~~~~      qt:Z:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~       RG:Z:34e88679/0--32

jemunro commented 2 years ago

Hi Cyril,

Your input file contains CCS reads, not subreads, which unfortunately are not compatible with the pipeline. You will need to obtain the original subreads BAM file in order to use the pipeline.
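
For anyone hitting the same error, here is a minimal Python sketch of the check done by eye above. It takes the read name and the optional tag fields from one line of the samtools view <input.bam> | cut -f1,12- output and reports whether the record looks like a CCS read or a subread carrying the ip and pw kinetics tags. The function name classify_read and the returned labels are hypothetical and not part of PLASTER or any PacBio tool. The "/ccs" name suffix is taken from the output posted in this thread, and the ip/pw tag names from the question above.

```python
def classify_read(name, tag_fields):
    """Guess the PacBio read type from one record (hypothetical helper).

    name       -- the read name, i.e. column 1 of the SAM record
    tag_fields -- the optional fields (column 12 onwards), each TAG:TYPE:VALUE
    """
    # Keep only the two-letter tag names, e.g. "np" from "np:i:55".
    tags = {field.split(":", 1)[0] for field in tag_fields}

    # The read posted in this thread is named <movie>/<zmw>/ccs.
    if name.endswith("/ccs"):
        return "ccs"

    # The question above refers to the kinetics data as the ip and pw tags.
    if {"ip", "pw"} <= tags:
        return "subreads-with-kinetics"

    return "unknown"


# Read name and the first few tags from the record posted above.
print(classify_read("m64187e_220214_111831/461574/ccs",
                    ["ec:f:55", "np:i:55", "rq:f:1"]))
```

For the record posted above this prints ccs, which matches the diagnosis: the file holds CCS reads rather than subreads.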