jydu / maffilter

The MafFilter genome alignment processor
GNU General Public License v3.0

Use WindowSplit with an overlapping Window #14

Closed JohnBioinf closed 2 years ago

JohnBioinf commented 3 years ago

I would like to split a MAF file into overlapping blocks of a specific size, with intervals like 1-100, 50-150, 101-200, ...

In the documentation 2.1.8 it reads: "It is possible to generate overlapping windows."

Does this refer to the scenario I am describing, and if so, how can I achieve it?

Kind regards, John

jydu commented 3 years ago

Dear John,

Yes, this refers to the scenario you describe, but unfortunately this is an error in the documentation, as the feature is not currently implemented :s I sincerely apologize for that. Adding this feature, however, should not be too much hassle. If you are not in too much of a hurry, I can give it a go.

Julien.

JohnBioinf commented 3 years ago

Yes, this would be an absolute blast. I am currently estimating the time complexity of the algorithm we want to run on the MAF blocks, and it looks like we need to split the bigger blocks. This would then preferably be done with a sliding window. If we really need the splitting, we will most definitely cite MafFilter, and even more so if the sliding window feature is implemented. Furthermore, I think the sliding window would be a useful feature for others.

jydu commented 3 years ago

Hi John,

I added an option window_step (default equal to preferred_size) to add support for overlapping windows. For now it only works with align=ragged_left. To try it, you need to update bpp-seq-omics and maffilter from the git repository (master branch). As far as I can see it seems to be working. Please let me know if you need further help, or if you find some issues.

All the best,

Julien.
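
For readers following along, the effect of a step smaller than the window size can be sketched in a few lines of Python. This is only an illustration of the sliding-window arithmetic, not MafFilter's actual implementation: the function name sliding_windows is hypothetical, coordinates are 0-based and half-open, and the sketch assumes windows start at the left edge of the block and that a trailing partial window is simply dropped (how align=ragged_left treats the remainder is not described in this thread).

```python
def sliding_windows(block_length, preferred_size, window_step=None):
    """Yield (start, end) pairs, 0-based and half-open, for full-size windows.

    Hypothetical helper for illustration only. When window_step is omitted
    it defaults to preferred_size, which gives non-overlapping windows.
    """
    if window_step is None:
        window_step = preferred_size
    if preferred_size <= 0 or window_step <= 0:
        raise ValueError("preferred_size and window_step must be positive")

    start = 0
    # Assumption: a final window shorter than preferred_size is dropped.
    while start + preferred_size <= block_length:
        yield (start, start + preferred_size)
        start += window_step


if __name__ == "__main__":
    # A 200-column block, windows of 100 columns, moved 50 columns at a time.
    for start, end in sliding_windows(200, 100, 50):
        print(start, end)
```

With a block of 200 columns, a window size of 100, and a step of 50, the windows overlap by half; with the step left at its default, they tile the block without overlap. In a MafFilter option file this would presumably correspond to setting preferred_size=100 and window_step=50 on WindowSplit together with align=ragged_left, but the exact syntax should be checked against the documentation.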