tseemann / barrnap

:microscope: :leo: Bacterial ribosomal RNA predictor
GNU General Public License v3.0

Replace SILVA databases with unencumbered-licence source #13

Open tseemann opened 9 years ago

tseemann commented 9 years ago

@satta has been trying to package Barrnap into Debian-Med but has reported that the SILVA alignments (23S) have a licence which is incompatible with Debian.

(It's only free for academic/non-commercial use: http://www.arb-silva.de/silva-license-information/ )

Goal would be to construct new 23S alignments from Refseq and build our own models.

satta commented 8 years ago

FYI, I have started to work on this a bit. Please find in https://github.com/satta/barrnap/tree/build_own_hmms/build a version of barrnap with a changed build pipeline which

This replaces the SILVA step completely and results in a new set of HMMs (committed in the branch as well), which mostly produce identical matches on the example data. Sometimes there are slightly different start positions, with about 1 bp deviation, and slightly different score values. Only a few hits are missed completely (2 in the fungal set). You can take a look and use the compare_results.lua script (needs gt) to compare old and new results.
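For readers without gt installed, the kind of comparison described above can be roughly sketched in Python. This is not the compare_results.lua script and makes no claim to match its behaviour. It assumes both runs were saved as GFF3 text with a `Name=` attribute in column 9, and the function names (`parse_gff3`, `compare_hits`) and the 1 bp default tolerance are hypothetical choices made for illustration.

```python
from collections import namedtuple

# One rRNA hit from a GFF3 line. Field names are illustrative assumptions.
Hit = namedtuple("Hit", "seqid name start end strand score")


def parse_gff3(text):
    """Parse GFF3 text into a list of Hit tuples, skipping comments."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete GFF3 feature line
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        name = attrs.get("Name", cols[2])  # fall back to the feature type
        hits.append(Hit(cols[0], name, int(cols[3]), int(cols[4]), cols[6], cols[5]))
    return hits


def compare_hits(old, new, tolerance=1):
    """Classify each old hit as identical, shifted (within tolerance), or missed.

    New hits that match no old hit are reported under "extra".
    """
    report = {"identical": [], "shifted": [], "missed": [], "extra": []}
    unused = list(new)
    for o in old:
        candidates = [
            n for n in unused
            if (n.seqid, n.name, n.strand) == (o.seqid, o.name, o.strand)
            and abs(n.start - o.start) <= tolerance
            and abs(n.end - o.end) <= tolerance
        ]
        if not candidates:
            report["missed"].append(o)
            continue
        # Prefer the candidate with the smallest total coordinate deviation.
        best = min(candidates, key=lambda n: abs(n.start - o.start) + abs(n.end - o.end))
        unused.remove(best)
        same = (best.start, best.end) == (o.start, o.end)
        report["identical" if same else "shifted"].append((o, best))
    report["extra"] = unused
    return report
```

Calling `compare_hits(parse_gff3(old_text), parse_gff3(new_text))` on the two outputs gives per-category lists, so the number of missed hits per example set can be read off directly. Score differences are deliberately ignored here, since only coordinates decide whether two hits count as the same.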

I'd be happy to get some suggestions if you can think of any improvements. Preprocessing or filtering the raw RefSeq downloads comes to mind, but I'm not an expert on these RNAs to make any judgment calls there.

Thanks, Sascha

wwood commented 7 years ago

Hi there, I'm similarly interested, for similar reasons (packaging for GNU Guix). Has there been any update? Given the small differences you observed @satta, would it make sense to offer your HMMs as an official alternative to SILVA? Thanks!

satta commented 7 years ago

Hi @wwood, sorry I missed your comment. Given my lack of practical experience as a user, I would probably need some more tests and/or confirmation by an expert that the models are an alternative to the SILVA ones. I have no idea how 'bad' missing results are for an end user. Hence my request for @tseemann's comments. In Debian we currently do not ship the SILVA-derived HMMs.

tseemann commented 6 years ago

@satta I think this problem may be solved soon?

https://www.arb-silva.de/silva-license-information/

Change of SILVA license model for commercial users in Fall 2018 - free of cost for any usage.

With the next full database release which is expected for Fall 2018, the SILVA project will resign the current dual licensing model and the SILVA datasets will become free also for commercial/non-academic users. With this change SILVA is following the recommendations of an ELIXIR Core Data Resource.

satta commented 6 years ago

Good news! I guess we can then finally ship all HMMs. :)

tseemann commented 6 years ago

Since then everyone has moved to bioconda :P

sfehrmann commented 5 years ago

they certainly want to ensure the license transition is carried out thoroughly

Change in SILVA license model

expected for Summer 2019

bwlang commented 3 years ago

i think this can be closed... silva is now fully open.