haowenz / chromap

Fast alignment and preprocessing of chromatin profiles
https://haowenz.github.io/chromap/
MIT License

estimating --min-frag-length #3

Closed Maarten-vd-Sande closed 3 years ago

Maarten-vd-Sande commented 3 years ago

Chromap looks awesome based on the preprint!! :1st_place_medal:

If I understand correctly, chromap needs --min-frag-length to be set so that the index is built appropriately. How do I estimate this value beforehand? I am trying to add chromap to "my" pipeline (seq2science), and I would like this setting to be chosen automatically. The pipeline already generates loads of QC, so could I just look for the shortest read in my sample and use that? What would you recommend?

haowenz commented 3 years ago

For ChIP-seq, it is okay to use the read length as an estimate of min-frag-length. For ATAC-seq, open chromatin regions can be very short (e.g., ~30 bp), so the sequencing may produce many fragments of about that length, which is shorter than the read length (e.g., 50 bp or 100 bp). In this case, you can set min-frag-length to 30. Overall, in most cases you can use the read length, or a smaller value based on your experimental protocol, as min-frag-length.
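
For a pipeline that has to pick this value automatically, the rule of thumb above could be sketched as follows. This is only an illustration: the function name, the assay labels, and the `atac_floor` parameter are hypothetical and not part of chromap, and the 30 bp default is simply the example value mentioned above.

```python
def suggest_min_frag_length(assay: str, read_length: int, atac_floor: int = 30) -> int:
    """Suggest a value for --min-frag-length from the assay type and read length.

    Hypothetical helper, not part of chromap. It encodes the rule of thumb
    from this thread: read length for ChIP-seq, and a smaller value
    (30 bp by default) for ATAC-seq.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")

    assay = assay.lower()
    if assay == "chip-seq":
        # ChIP-seq: the read length is an acceptable estimate.
        return read_length
    if assay == "atac-seq":
        # ATAC-seq: fragments can be shorter than the reads,
        # so never suggest more than the floor value.
        return min(read_length, atac_floor)
    raise ValueError(f"unsupported assay: {assay!r}")


# Example: turn the suggestion into the command-line argument pair.
args = ["--min-frag-length", str(suggest_min_frag_length("atac-seq", 100))]
```

For other protocols the right value depends on the expected fragment sizes, so the sketch raises an error for them instead of guessing.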

Maarten-vd-Sande commented 3 years ago

Thanks for the fast reply!

In the case of paired-end ChIP-seq, do I take the sum of the two read lengths, or just the length of one of the two reads?

haowenz commented 3 years ago

You can use the length of one end.
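
One way a pipeline might obtain that length is to inspect the FASTQ file of a single mate. The sketch below assumes plain four-line FASTQ records (no wrapped sequences) and takes the most common length among the first reads as "the read length". Both choices are assumptions made for illustration, and `typical_read_length` is a hypothetical helper, not something chromap provides.

```python
import io
from collections import Counter
from itertools import islice


def typical_read_length(fastq_handle, max_reads: int = 10_000) -> int:
    """Return the most common read length among the first max_reads records.

    Assumes four-line FASTQ records, so the sequence is every 4th line
    starting from the 2nd line of the file.
    """
    lengths = Counter(
        len(line.strip())
        for line in islice(fastq_handle, 1, max_reads * 4, 4)
    )
    if not lengths:
        raise ValueError("no reads found")
    return lengths.most_common(1)[0][0]


# Example with an in-memory FASTQ for one mate (R1) only.
r1 = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGTACGT\n+\nIIIIIIII\n"
    "@read3\nACGT\n+\nIIII\n"
)
read_length = typical_read_length(r1)
```

Only one mate is inspected here, matching the advice to use the length of one end and not the sum of both.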

Maarten-vd-Sande commented 3 years ago

Clear! Thanks