jyhehir / mobster

For more details about Mobster please see
https://jyhehir.github.io/mobster/index.html
GNU General Public License v3.0

repeatmask format file #58

Closed adriludwig closed 1 year ago

adriludwig commented 1 year ago

Hi! I'm working with Drosophila and I would like to include the repmask file to eliminate the "old" insertions while using Mobster. I noticed that there are two additional columns, bin and id, and I could not work out what exactly these columns are. Can you give me more information about them? Thanks very much!

ramonamerong commented 1 year ago

I am not sure either, but for Mobster only the 'genoName', 'genoStart', 'genoEnd' and 'repFamily' columns matter. They must, however, be the 6th, 7th, 8th and 13th columns of the repMask file for Mobster to work!
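To make the column positions concrete, here is a minimal sketch of reading just those four fields by position. It assumes the file is tab-separated and that lines starting with `#` are headers; the function name `read_repmask` and the example row are made up for illustration and are not part of Mobster.

```python
import csv
import io

# 1-based positions 6, 7, 8 and 13 from the comment above, as 0-based indices.
GENO_NAME, GENO_START, GENO_END, REP_FAMILY = 5, 6, 7, 12


def read_repmask(handle):
    """Yield (genoName, genoStart, genoEnd, repFamily) tuples.

    Assumption: the input is tab-separated; '#' lines are treated as headers.
    """
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, header/comment lines and rows that are too short.
        if not row or row[0].startswith("#") or len(row) <= REP_FAMILY:
            continue
        yield (row[GENO_NAME], int(row[GENO_START]),
               int(row[GENO_END]), row[REP_FAMILY])


# Illustrative data only: one header line and one invented 17-column row.
example = io.StringIO(
    "#bin\tswScore\t...\n"
    "585\t1504\t13\t4\t13\tchr2L\t100\t400\t-23513312\t+\tROO_I\tLTR\tPao\t1\t300\t-10\t1\n"
)
records = list(read_repmask(example))
print(records)
```

If a file keeps these four fields in the stated positions, the contents of the remaining columns (including bin and id) should not matter for this kind of parsing.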

However, Mobster is not designed to use a repeatmask file for another organism, so it is better to filter the predictions yourself afterwards. Also see points 4 and 5 at the bottom of the documentation: https://jyhehir.github.io/mobster/documentation.html
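A rough sketch of what such post-hoc filtering could look like is given below. It assumes predictions have already been reduced to `(chromosome, position)` pairs and repeats to `(chromosome, start, end, family)` tuples; these shapes, the function name `filter_predictions` and the `margin` parameter are hypothetical and do not reflect Mobster's actual output format.

```python
from collections import defaultdict


def filter_predictions(predictions, repeats, margin=0):
    """Keep predictions that do not fall within `margin` bp of an annotated repeat.

    Assumption: predictions are (chrom, pos) pairs and repeats are
    (chrom, start, end, family) tuples; both formats are hypothetical.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, _family in repeats:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, pos in predictions:
        # Linear scan per chromosome; adequate for a sketch, slow for large files.
        near_repeat = any(start - margin <= pos <= end + margin
                          for start, end in by_chrom[chrom])
        if not near_repeat:
            kept.append((chrom, pos))
    return kept


# Illustrative data only.
repeats = [("chr2L", 100, 400, "Pao")]
predictions = [("chr2L", 250), ("chr2L", 900), ("chr3R", 250)]
kept = filter_predictions(predictions, repeats, margin=50)
print(kept)
```

For genome-scale annotation files, a sorted list with binary search or an interval tree would be a better fit than the linear scan used here.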

adriludwig commented 1 year ago

Thanks for the information! I ran some tests, but it seems I won't be able to use the "real" Drosophila repmask file to eliminate reference insertions, because most of them do not belong to the 4 TE groups that Mobster looks for.