PengNi / ccsmeth

Detecting DNA methylation from PacBio CCS reads
BSD 3-Clause Clear License

Question: types of input, old data or just HiFi? #13

Closed mikeds211 closed 2 years ago

mikeds211 commented 2 years ago

This looks like exactly what I have been searching for for some time. I have old sequel2 data, but it is not the more recent HiFi type. Before I invest time in installing and testing this, can anyone tell me whether it accepts normal, old sequel2 output files (i.e. non-HiFi files)?

PengNi commented 2 years ago

Hi @mikeds211 , sorry for the delay! Since v0.2.0, ccsmeth only accepts HiFi reads as input. However, I think the old sequel2 output files can be transformed to HiFi reads using the PacBio tool pbccs.

Best, Peng

mikeds211 commented 2 years ago

Thanks for the tip. Unfortunately, pbccs is not available in conda for macOS; only the Linux version is available. That took me some time to work out. I will look around to see if there is an alternative.

mikeds211 commented 2 years ago

It is not made clear that this program only works under Linux. I was hoping to use it under macOS, but while many of the PacBio tools are available on both Linux and macOS, pbccs is currently only available for Linux. I have used pbccs in Ubuntu within VirtualBox to convert sequel2 data to HiFi, but I now realise that pbccs is also required for ccsmeth itself to run. To me, that means ccsmeth is Linux only. Is that correct?

PengNi commented 2 years ago

@mikeds211 , yes, some steps of ccsmeth require PacBio tools, which means it is Linux only.

Best, Peng

mikeds211 commented 2 years ago

OK, so now I've got Linux running under VirtualBox on my iMac, and the installation of ccsmeth went smoothly. I could convert my old sequel2 file to HiFi using ccs, but the "ccsmeth call_hifi" command (as shown on the code page) fails with the error "FATAL | ccs ERROR: Missing base features for output kinetic mode: IPD or PulseWidth". That probably means I needed to convert my sequel2 file to HiFi with one of the kinetics flags, '--all-kinetics' or '--hifi-kinetics'.
Question: any ideas which of the two flags I should use? I had thought the basic ccs command would retain the kinetic data from the original .bam file, but I am now guessing that I have to process it again with a kinetics flag added. As an aside, I also have pacbiotools running in VirtualBox, so I am hoping to be able to compare the results of ccsmeth with the PacBio kit.
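
For readers following along, here is a minimal sketch of how the ccs conversion with a kinetics flag could be assembled, so that IPD and PulseWidth values are carried into the HiFi BAM. It only builds and prints the command; it does not run ccs. The helper name `build_ccs_command` and the file names are hypothetical. Defaulting to `--hifi-kinetics` is an assumption made for illustration and does not settle which flag ccsmeth expects.

```python
import shlex

# Kinetics options of the PacBio ccs tool discussed in this thread.
KINETICS_FLAGS = {
    "hifi": "--hifi-kinetics",  # averaged kinetics for HiFi reads
    "all": "--all-kinetics",    # averaged kinetics for every ZMW
}


def build_ccs_command(subreads_bam, output_bam, kinetics="hifi", threads=None):
    """Return the ccs argument list (hypothetical helper, not part of ccsmeth)."""
    if kinetics not in KINETICS_FLAGS:
        raise ValueError(f"kinetics must be one of {sorted(KINETICS_FLAGS)}")
    cmd = ["ccs", subreads_bam, output_bam, KINETICS_FLAGS[kinetics]]
    if threads is not None:
        cmd += ["--num-threads", str(threads)]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; substitute the real subreads and output paths.
    cmd = build_ccs_command("old_run.subreads.bam", "old_run.hifi.bam", threads=8)
    print(shlex.join(cmd))
```

The printed line can be pasted into a Linux shell where ccs is installed, or the list can be passed to `subprocess.run`.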

PengNi commented 2 years ago

@mikeds211 , thank you very much for trying to use ccsmeth.

Best, Peng

PengNi commented 2 years ago

Also, I don't know if you have tried a container (like Docker or Singularity) to run Linux tools on a Mac. I think it is more flexible than VirtualBox.

mikeds211 commented 2 years ago

Thanks for the update. I will wait for a stable release. I have used Docker before, but I have a stable Ubuntu Linux in VirtualBox and pacbiotools is running nicely on my iMac. My DNA sequence data is bacterial, from widely different genera, so all types of base modifications are present.

PengNi commented 2 years ago

The model for the call_mods module has been released.