WGLab / RepeatHMM

a hidden Markov model to infer simple repeats from genome sequences

Apply to exome sequence data #11

Closed tlu0918 closed 6 years ago

tlu0918 commented 6 years ago

Dear RepeatHMM

I have a question about applying RepeatHMM to exome data. I currently have exome data genotyped with the Illumina HumanExome BeadChip, and I would like to estimate the CAG repeat count in the HTT gene on chr4.

Can I apply RepeatHMM directly to my exome data? Or should I first increase the density of my data via imputation, for example with IMPUTE2? My data is in PLINK bed format. Could you please suggest a tool to convert it to BAM format? Thanks!

Best,

Ake

kaichop commented 6 years ago

RepeatHMM is designed for PacBio/Nanopore data. It cannot handle your data.

tlu0918 commented 6 years ago

Dear Kai

Thank you very much for your quick response. Hopefully we will have the data in the future to apply RepeatHMM.

I have one more question. Can RepeatHMM be applied to a SAM file in which the FLAG field is missing?

Thanks!

Ake

liuqianhn commented 6 years ago

Hi @tlu0918, RepeatHMM uses the standard SAM format. An input with no FLAG field might cause an error. An input with an improper FLAG field would not cause an error, but the results would not be guaranteed to be correct. If the re-alignment option is used, an improper FLAG field would have little effect, since the reads would be re-aligned.
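To check beforehand whether a SAM file has the FLAG problem described above, a small pre-check along these lines might help. This is only a sketch and is not part of RepeatHMM: the function name `check_flag` is hypothetical, and it assumes the standard SAM layout of 11 tab-separated mandatory fields, with FLAG as the second field and an integer between 0 and 65535. It does not check whether the FLAG bits are meaningful for a given read.

```python
SAM_MANDATORY_FIELDS = 11   # QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL
MAX_FLAG = 2**16 - 1        # FLAG is a 16-bit unsigned integer in the SAM format


def check_flag(line):
    """Return None if the FLAG column looks well-formed, else a short message.

    Hypothetical helper for illustration; it only inspects the layout of one
    SAM line and does not validate the meaning of individual FLAG bits.
    """
    if line.startswith("@"):
        return None  # header lines carry no FLAG field
    fields = line.rstrip("\n").split("\t")
    if len(fields) < SAM_MANDATORY_FIELDS:
        return f"expected at least {SAM_MANDATORY_FIELDS} fields, found {len(fields)}"
    flag = fields[1]
    if not (flag.isascii() and flag.isdigit()):
        return f"FLAG is not a non-negative integer: {flag!r}"
    if int(flag) > MAX_FLAG:
        return f"FLAG is out of range: {flag}"
    return None


if __name__ == "__main__":
    # A made-up record with the FLAG column dropped, so only 10 fields remain.
    record = "read1\tchr4\t3074877\t60\t5M\t*\t0\t0\tCAGCA\tIIIII"
    print(check_flag(record))
```

A record that fails this check is the "no FLAG field" case; a record that passes it may still carry an improper FLAG value, which is the case where re-alignment would matter.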

tlu0918 commented 6 years ago

Hi

Thank you so much for the clarification!

Best

Ake