medema-group / BiG-SCAPE

Similarity networks of biosynthetic gene clusters
GNU Affero General Public License v3.0

cosine distance among BGCs #20

Closed by neptuneyt 1 year ago

neptuneyt commented 1 year ago

Dear author,

Is there any way to calculate the cosine distance between two BGC sequences?

I have tried the Python library textdistance, but the results make no sense.

jorgecnavarrom commented 1 year ago

Hi

I'm not really sure, to be honest. Perhaps this could help (see pfam2vec)? https://academic.oup.com/nar/article/47/18/e110/5545735

marnixmedema commented 1 year ago

Not sure either, but @satriaphd used cosine distances for BiG-SLiCE in our recent study: https://www.nature.com/articles/s41564-022-01110-2. Perhaps he can comment on the code used to do this.
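
For readers landing on this thread: cosine distance is defined on numeric vectors, so applying it to raw sequence strings is unlikely to be meaningful. A minimal sketch, assuming each BGC has already been annotated as a list of protein domain identifiers (for example Pfam accessions), is to count domains per BGC and compare the count vectors. This is not the code used in BiG-SCAPE, BiG-SLiCE, or pfam2vec; the function name `cosine_distance` and the domain lists below are hypothetical and only for illustration.

```python
import math
from collections import Counter


def cosine_distance(domains_a, domains_b):
    """Cosine distance between two BGCs given as lists of domain identifiers.

    Each BGC is turned into a vector of domain counts; the distance is
    1 - cosine similarity of those two vectors (0 = same composition,
    1 = no shared domains).
    """
    counts_a = Counter(domains_a)
    counts_b = Counter(domains_b)

    # Dot product only needs the domains present in both BGCs.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[d] * counts_b[d] for d in shared)

    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("each BGC needs at least one annotated domain")

    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical domain annotations, used here only as placeholder labels.
bgc_a = ["PF00109", "PF02801", "PF00698", "PF00550"]
bgc_b = ["PF00501", "PF00668", "PF00550", "PF00550"]

print(cosine_distance(bgc_a, bgc_b))
```

Domain order and sequence divergence are ignored in this representation, so it only captures domain composition. Learned embeddings such as those described in the pfam2vec paper linked above would replace the count vectors but leave the distance computation unchanged.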