Lifemap-ToL / lifemap-back

Lifemap infrastructure and builder.
GNU General Public License v3.0

List missing taxids in wikidata #19

Closed. juba closed this issue 2 months ago

juba commented 2 months ago

Goal: list the NCBI taxids that are not present in Wikidata.

The SPARQL query to list the taxids present in Wikidata is:

SELECT DISTINCT ?ncbiId WHERE {
    ?taxon wdt:P685 ?ncbiId.
}
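
For illustration, here is a minimal sketch of how this query could be sent to the public Wikidata SPARQL endpoint and parsed into a set of integers. It is not the script from the repository: the function names and the User-Agent string are hypothetical, and only the Python standard library is used. The full result set is large, so the public endpoint may time out on it.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint.
WIKIDATA_SPARQL_URL = "https://query.wikidata.org/sparql"

# Same query as in the issue: P685 is the "NCBI taxonomy ID" property.
QUERY = """
SELECT DISTINCT ?ncbiId WHERE {
    ?taxon wdt:P685 ?ncbiId.
}
"""


def build_request(query: str) -> urllib.request.Request:
    """Build a GET request asking the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        f"{WIKIDATA_SPARQL_URL}?{params}",
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical User-Agent, replace with a real contact string.
            "User-Agent": "lifemap-missing-taxids-sketch/0.1 (example)",
        },
    )


def parse_taxids(payload: dict) -> set[int]:
    """Extract numeric taxids from a SPARQL JSON results document."""
    taxids = set()
    for binding in payload["results"]["bindings"]:
        value = binding["ncbiId"]["value"]
        # Skip any value that is not a plain decimal number.
        if value.isdecimal():
            taxids.add(int(value))
    return taxids


def fetch_wikidata_taxids() -> set[int]:
    """Run the query over the network and return the taxids found."""
    with urllib.request.urlopen(build_request(QUERY), timeout=300) as response:
        return parse_taxids(json.load(response))
```

Calling `fetch_wikidata_taxids()` performs the network request. `parse_taxids` can be exercised offline on a saved response.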

juba commented 2 months ago

Script added to builder/scripts/. It must be run on the backend after the Lifemap data has been updated, because it reads the list of NCBI taxids from TreeFeaturesComplete.parquet.

Taxids missing from Wikidata are exported to builder_results/missing_wikidata_taxids.parquet.
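
The comparison step can be sketched as a plain set difference. This is an assumption-laden illustration, not the actual script: the function names and the `taxid` column name are hypothetical (the real column name in TreeFeaturesComplete.parquet is not stated here), and the parquet helper assumes pandas with a parquet engine such as pyarrow is installed.

```python
def missing_taxids(ncbi_taxids, wikidata_taxids):
    """Return the sorted NCBI taxids that do not appear in Wikidata."""
    return sorted(set(ncbi_taxids) - set(wikidata_taxids))


def export_missing(ncbi_parquet, wikidata_taxids, out_parquet, column="taxid"):
    """Read NCBI taxids from a parquet file and write the missing ones.

    The `column` default is a hypothetical name used for illustration.
    """
    # Imported here so that missing_taxids stays usable without pandas.
    import pandas as pd

    ncbi = pd.read_parquet(ncbi_parquet, columns=[column])[column]
    missing = missing_taxids(ncbi.tolist(), wikidata_taxids)
    pd.DataFrame({column: missing}).to_parquet(out_parquet, index=False)
    return missing
```

With this sketch, `export_missing("TreeFeaturesComplete.parquet", wikidata_taxids, "builder_results/missing_wikidata_taxids.parquet")` would mirror the input and output files named above, given a set of Wikidata taxids obtained from the query.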