PacificBiosciences / pbbioconda

PacBio Secondary Analysis Tools on Bioconda. Contains list of PacBio packages available via conda.
BSD 3-Clause Clear License

The *.flnc_count.txt output from isoseq collapse does not generate columns for multiple samples #676

Closed YanqingHuang01 closed 1 month ago

YanqingHuang01 commented 3 months ago

I followed the steps in this page: https://isoseq.how/clustering/examples.html for multiplexed samples.

Demux and primer removal
$ lima --isoseq --peek-guess *.hifi_reads.bam IsoSeq_v2_primers_12.fasta output.bam

Combine inputs
$ ls output.IsoSeqX*bam > all.fofn

Remove poly(A) tails and concatemer
$ isoseq refine all.fofn IsoSeq_v2_primers_12.fasta flnc.bam --require-polya

Cluster2
$ isoseq cluster2 flnc.bam clustered.bam

Mapping
$ pbmm2 align --preset ISOSEQ --sort

Collapse
$ isoseq collapse --do-not-collapse-extra-5exons

However, the output file *.flnc_count.txt doesn't contain columns for multiple samples. It only generates a column labeled 'BioSample_1' instead of the expected layout:

id      BioSample1  BioSample2
PB.1.1  2           2
PB.2.1  1           2
PB.3.1  1           1
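For anyone who wants to confirm the same symptom, the header line of the count file shows how many sample columns were written. The snippet below is only a minimal sketch: `sample_columns` is a hypothetical helper, and it assumes the first field is the isoform id and that fields are separated by commas or whitespace, which may not match the actual file exactly.

```python
import re


def sample_columns(header_line):
    """Return the sample column names from a count-table header line.

    Assumption: the first field is the isoform id column and the fields
    are separated by commas and/or whitespace.
    """
    fields = [f for f in re.split(r"[,\s]+", header_line.strip()) if f]
    return fields[1:]


# Illustrative header lines based on the layouts described above.
observed = sample_columns("id BioSample_1")
expected = sample_columns("id BioSample1 BioSample2")

print(len(observed), "sample column(s) observed:", observed)
print(len(expected), "sample column(s) expected:", expected)
```

To check a real file, pass the first line of the *.flnc_count.txt output to `sample_columns`.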

eprdz commented 2 months ago

For me it's the same. Any updates on this issue?

jmattick commented 1 month ago

Hi @YanqingHuang01. Thanks for catching an issue with our docs. The example is missing the --overwrite-biosample-names option on the cDNA demux step.

https://isoseq.how/clustering/cli-workflow.html#step-2---primer-removal-and-demultiplexing
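As a rough illustration of that fix, the demux command from the original post could be assembled with the extra option as sketched below. This is an assumption-laden sketch rather than the documented invocation: `lima_demux_command` is a hypothetical helper, `movie.hifi_reads.bam` is a placeholder input name, and the position of `--overwrite-biosample-names` among the other options is a guess. The linked workflow page is the authoritative reference.

```python
import shlex


def lima_demux_command(hifi_bam, primers_fasta, output_bam):
    """Build the lima demux command from the original post, adding
    --overwrite-biosample-names (flag position is an assumption)."""
    return [
        "lima",
        "--isoseq",
        "--peek-guess",
        "--overwrite-biosample-names",
        hifi_bam,
        primers_fasta,
        output_bam,
    ]


# "movie.hifi_reads.bam" is a hypothetical input file name.
cmd = lima_demux_command(
    "movie.hifi_reads.bam", "IsoSeq_v2_primers_12.fasta", "output.bam"
)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` once the file names match your data.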