
zpf0117b / CLMB

Contrastive Learning for Metagenomic Binning

MIT License


Sequence Mimics #4

Open millanp95 opened 2 years ago

millanp95 commented 2 years ago

Hi,

Have you tried the probabilistic mimic sequences proposed in https://www.biorxiv.org/content/10.1101/2021.05.13.444008v4 as data augmentation?

zpf0117b commented 2 years ago

Hi, @millanp95

The ideas of FCGRk and probabilistic mimic sequences are inspiring. Unfortunately, we didn't try an image transformation before data augmentation. We only considered numerical distortion for CLMB, because both k-mer frequencies and abundances are used as the feature data of individual DNA fragments. Your ideas might help improve CLMB.
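
For readers unfamiliar with what "numerical distortion" of such features can look like, here is a minimal sketch. It is not CLMB's actual code: the helper names `kmer_frequencies` and `distort`, the choice of Gaussian noise, the value of `sigma`, and the example abundance values are all assumptions made for illustration. It builds a k-mer frequency vector for one fragment, appends per-sample abundances, and produces two noisy views of the same fragment, which is the kind of positive pair a contrastive loss would use.

```python
import itertools
import random


def kmer_frequencies(seq, k=4):
    """Return the normalized k-mer frequency vector of seq over A/C/G/T."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def distort(features, rng, sigma=0.01):
    """Assumed distortion: add Gaussian noise, then clip at zero."""
    return [max(0.0, x + rng.gauss(0.0, sigma)) for x in features]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical DNA fragment and hypothetical abundances in two samples.
fragment = "ACGTACGTTGCAAGCTNACGT"
abundances = [3.2, 0.7]

features = kmer_frequencies(fragment, k=2) + abundances

# Two independently distorted views of the same fragment (a positive pair).
view_a = distort(features, rng)
view_b = distort(features, rng)
```

Because the distortion acts directly on the feature vector, it treats k-mer frequencies and abundances alike, whereas a sequence-level augmentation such as mimic sequences would only change the k-mer part.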