Open yroskov opened 2 years ago
Markus: we have the ICTV data in DwC-A format, and this format has no "realm" rank; the data should be in CoLDP to get a classification with realms.
A first ColDP version of ICTV is now on dev: https://data.dev.catalogueoflife.org/dataset/1014/classification
The metadata is still wrong; I am still working on that.
Otherwise it should contain all ranks as they exist in the MSL, including realms and subrealms, under a single Viruses root. There are quite a few unplaced families and other taxa, but you have probably seen that already. Also, type species were removed in the latest ICTV release.
on prod now too: https://data.catalogueoflife.org/dataset/1014/about
Many thanks @mdoering!
I have checked the version on PROD and everything is fine (https://github.com/CatalogueOfLife/testing/issues/9#issuecomment-1035214242). I am now doing the re-assembly; deletion of the old sectors is very slow.
@mdoering, I ran an experiment on modifying the top level of the CoL Tree to accommodate the higher classification of viruses: https://github.com/CatalogueOfLife/testing/issues/182
It seems the Tree modification was successful: it now includes Biota and Viruses as the two top points.
[ ] A change is needed in the CoLDP ICTV dataset: it should include the original "realm" rank (the rank above kingdom). The current version of the CoLDP dataset has no realms. Could you please modify your script for ICTV?
[ ] It would be helpful to have a single top point, "unranked Viruses", in the CoLDP ICTV dataset. ICTV would then have a single point of attachment in CoL, and all future changes in virus classification would go into CoL without re-assembly.
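To illustrate the two requests above, here is a minimal sketch of what the top of such a dataset might look like, with a check that there is exactly one root and that every kingdom sits under a realm. The IDs (`v0`, `v1`, `v2`), the helper names and the four-column layout are assumptions for illustration only, not the actual ICTV conversion script or its output; Riboviria and Orthornavirae are used simply as an example realm and kingdom.

```python
import csv
import io

# Hypothetical rows: IDs and the column selection are illustrative assumptions.
ROWS = [
    {"ID": "v0", "parentID": "", "rank": "unranked", "scientificName": "Viruses"},
    {"ID": "v1", "parentID": "v0", "rank": "realm", "scientificName": "Riboviria"},
    {"ID": "v2", "parentID": "v1", "rank": "kingdom", "scientificName": "Orthornavirae"},
]

FIELDS = ["ID", "parentID", "rank", "scientificName"]


def to_tsv(rows):
    """Serialise the rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def roots(rows):
    """Return the rows that have no parent, i.e. the attachment points."""
    return [r for r in rows if not r["parentID"]]


def has_realm_ancestor(rows, taxon_id):
    """Walk up the parentID chain and report whether a realm is found."""
    by_id = {r["ID"]: r for r in rows}
    current = by_id[taxon_id]
    while current["parentID"]:
        current = by_id[current["parentID"]]
        if current["rank"] == "realm":
            return True
    return False


if __name__ == "__main__":
    print(to_tsv(ROWS), end="")
    print("roots:", [r["scientificName"] for r in roots(ROWS)])
    for r in ROWS:
        if r["rank"] == "kingdom":
            print(r["scientificName"], "under a realm:", has_realm_ancestor(ROWS, r["ID"]))
```

With a single parentless row, the whole virus classification can be attached at one point, which is the situation the second checkbox asks for.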
ICTV: