PacificBiosciences / kineticsTools

Tools for detecting DNA modifications from single molecule, real-time sequencing data

IPDsummary with multiple chemistries #57

Closed clarepacini closed 2 years ago

clarepacini commented 6 years ago

Hi, I am wondering whether ipdSummary can handle reads from different chemistries. The BAM file header lists different binding kit and sequencing kit information for the read groups. Is this information checked for each read group, or is the sequencing chemistry assumed to be the same for the whole BAM file?

Many thanks,

Clare

hisakatha commented 2 years ago

I would also like to get an answer from the developers.

I tested ipdSummary in SMRT Link v6.0.0.47841 and confirmed that, when the input contained multiple chemistries, the predicted IPDs (modelPrediction in the HDF5 files) were based on the majority chemistry. The IPD ratios calculated for reads from the minority chemistries therefore appear to be biased, unless the IPD ratios are calculated separately from the reported predicted IPDs. If ipdSummary does not currently support multiple chemistries, I would like to ask whether it has never supported them, or whether it once did and the feature was later dropped. Because some published papers seem to have used multiple chemistries for modification detection, I would like to know how the tool behaved in the past.

For your information, I have found a function https://github.com/PacificBiosciences/kineticsTools/blob/665546178ea9284ce7718f204ad2364afbe26514/kineticsTools/ReferenceUtils.py#L127-L131 but I do not know the details of the entire flow.

rhallPB commented 2 years ago

Mixing chemistries has never been supported. In general, chemistries are updated so infrequently that experiments rarely include data from mixed chemistries. Mixing chemistries in any secondary analysis is not recommended.
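
For anyone who wants to check a BAM file before running ipdSummary, here is a minimal sketch that groups read groups by the binding kit and sequencing kit named in the header. It is not part of kineticsTools. It assumes the header text has already been dumped (for example with `samtools view -H`) and that each `@RG` line carries a `DS` field of semicolon-separated `KEY=VALUE` pairs including `BINDINGKIT` and `SEQUENCINGKIT`. The function name `kits_by_read_group` and the kit values in the example header are made up for illustration.

```python
from collections import defaultdict


def kits_by_read_group(header_text):
    """Group @RG IDs by their (BINDINGKIT, SEQUENCINGKIT) pair.

    Assumes each @RG line has a DS field of semicolon-separated KEY=VALUE
    pairs; a read group lacking a key gets None for that key.
    """
    groups = defaultdict(list)
    for line in header_text.splitlines():
        if not line.startswith("@RG\t"):
            continue
        # SAM header fields are tab-separated TAG:VALUE pairs.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        # The DS value itself holds KEY=VALUE pairs separated by ';'.
        ds = dict(kv.split("=", 1) for kv in tags.get("DS", "").split(";") if "=" in kv)
        groups[(ds.get("BINDINGKIT"), ds.get("SEQUENCINGKIT"))].append(tags.get("ID"))
    return dict(groups)


# Made-up header text with placeholder kit values (not real kit part numbers).
example_header = "\n".join([
    "@HD\tVN:1.5\tSO:coordinate",
    "@RG\tID:rg1\tPL:PACBIO\tDS:READTYPE=SUBREAD;BINDINGKIT=bindA;SEQUENCINGKIT=seqA",
    "@RG\tID:rg2\tPL:PACBIO\tDS:READTYPE=SUBREAD;BINDINGKIT=bindA;SEQUENCINGKIT=seqA",
    "@RG\tID:rg3\tPL:PACBIO\tDS:READTYPE=SUBREAD;BINDINGKIT=bindB;SEQUENCINGKIT=seqB",
])

result = kits_by_read_group(example_header)
for kits, read_groups in result.items():
    print(kits, read_groups)
if len(result) > 1:
    print("More than one kit combination found; consider analyzing each separately.")
```

If more than one kit combination shows up, the read groups listed for each combination could be split into separate BAM files and analyzed independently, which avoids the majority-chemistry effect described above.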