Closed hoelzer closed 1 year ago
We don't necessarily have to update the vpHMM_database_v3.tar.gz file; we can simply update additional_data_vpHMMs_v3.tsv to a new v4. @guille0387 has that.
This then needs to be uploaded to the EBI FTP @mberacochea so that we can add and access it here:
https://github.com/EBI-Metagenomics/emg-viral-pipeline/blob/dev/nextflow/modules/metaGetDB.nf#L25
And then we switch to v4 of this metadata file in the config:
https://github.com/EBI-Metagenomics/emg-viral-pipeline/blob/dev/nextflow.config#L51
and done.
That way, we would use v3 of the HMMs but v4 of the metadata file, removing outdated HMMs from the taxonomy assignment step.
Alright, that sounds like a plan. Should we add a warning message for metadata v3?
Ping me when the v4 metadata file is ready and I'll make the required changes.
Yes, good idea. @guille0387, we could add a warning message that v4 no longer includes the following virus taxa, which have been discontinued according to ICTV, and list them:
Siphoviridae Podoviridae Myoviridae Caudovirales Allolevivirus Autographivirinae Buttersvirus Chungbukvirus Incheonvirus Leviviridae Levivirus Mandarivirus Pbi1virus Phicbkvirus Radnorvirus Sitaravirus Vidavervirus
(Pls double-check that list)
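For illustration, here is a minimal Python sketch of how the metadata rows tied to these taxa could be separated out and turned into a warning message. It is not the pipeline's actual code: the column names `model` and `taxon`, the demo rows, and the helper names `split_discontinued` and `warning_message` are all assumptions made up for this example, and the real layout of additional_data_vpHMMs_v3.tsv may differ. The taxa list is copied from the comment above and still needs the same double-check.

```python
import csv
import io

# Taxa listed in this thread as discontinued by ICTV (list still to be double-checked).
DISCONTINUED_TAXA = frozenset(
    """Siphoviridae Podoviridae Myoviridae Caudovirales Allolevivirus
    Autographivirinae Buttersvirus Chungbukvirus Incheonvirus Leviviridae
    Levivirus Mandarivirus Pbi1virus Phicbkvirus Radnorvirus Sitaravirus
    Vidavervirus""".split()
)


def split_discontinued(rows, taxon_column="taxon", discontinued=DISCONTINUED_TAXA):
    """Split metadata rows into (kept, dropped) by their taxon.

    `taxon_column` is a hypothetical column name; adjust it to the real file.
    """
    kept, dropped = [], []
    for row in rows:
        if row[taxon_column] in discontinued:
            dropped.append(row)
        else:
            kept.append(row)
    return kept, dropped


def warning_message(dropped, taxon_column="taxon"):
    """Build a one-line warning naming the taxa that were removed."""
    taxa = sorted({row[taxon_column] for row in dropped})
    if not taxa:
        return ""
    return (
        "WARNING: metadata v4 no longer includes models for these "
        "discontinued taxa (ICTV): " + ", ".join(taxa)
    )


# Made-up demo data; the model IDs and column layout are hypothetical.
DEMO_TSV = (
    "model\ttaxon\n"
    "modelA\tSiphoviridae\n"
    "modelB\tMicroviridae\n"
    "modelC\tLevivirus\n"
)

rows = list(csv.DictReader(io.StringIO(DEMO_TSV), delimiter="\t"))
kept, dropped = split_discontinued(rows)
print(warning_message(dropped))
```

The kept rows would form the v4 metadata file, while the dropped rows only feed the warning text, so the HMM archive itself stays untouched as proposed above.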
Hey @mberacochea, I just sent you an email with the updated metadata file v4.
Should be solved for now with the merge of PR https://github.com/EBI-Metagenomics/emg-viral-pipeline/pull/103
This needs to be done. @guille0387 detected which ViPhOGs belong to discontinued viral taxa such as the families Siphoviridae, Myoviridae, ...
Re-calculating the models is not so easy, but for now, we can simply remove these old models (which are not that many) from the ViPhOG database.
@guille0387 I think you can provide a list of which models need to be removed. And then, we can update the ViPhOG database file (currently vpHMM_database_v3.tar.gz, and make v4?) and the pipeline accordingly? We should then also update the data here: https://osf.io/fbrxy/ which is linked in the manuscript.