nleguillarme closed this issue 3 years ago
You can configure the location of the taxonMap / taxonCache in the nomer properties.
To list the current properties:
$ nomer properties
nomer.append.schema.output.example.taxon.rank.order=[{"column":0,"type":"path.order.id"},{"column": 1,"type":"path.order.name"},{"column": 2,"type":"path.order"}]
nomer.append.schema.output=
nomer.cache.dir=./.nomer
nomer.doi.cache.url=
nomer.doi.min.match.score=100
nomer.eol.taxon=gz:https://zenodo.org/record/3834881/files/taxon.tab.gz!/taxon.tab
nomer.itis.synonym_links=gz:https://zenodo.org/record/3833105/files/synonym_links.gz!/synonym_links
nomer.itis.taxon_unit_types=gz:https://zenodo.org/record/3833105/files/taxon_unit_types.gz!/taxon_unit_types
nomer.itis.taxonomic_units=gz:https://zenodo.org/record/3833105/files/taxonomic_units.gz!/taxonomic_units
nomer.ncbi.merged=tar:gz:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz!/taxdump.tar!/merged.dmp
nomer.ncbi.names=tar:gz:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz!/taxdump.tar!/names.dmp
nomer.ncbi.nodes=tar:gz:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz!/taxdump.tar!/nodes.dmp
nomer.nodc.url=tar:gz:https://www.nodc.noaa.gov/cgi-bin/OAS/prd/download/50418.1.1.tar.gz!/50418.1.1.tar!/0050418/1.1/data/0-data/NODC_TaxonomicCode_V8_CD-ROM/TAXBRIEF.DAT
nomer.plazi.treatments.archive=https://github.com/plazi/treatments-rdf/archive/master.zip
nomer.pmid2doi.cache.url=ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/PMC-ids.csv.gz
nomer.schema.input=[{"column":0,"type":"externalId"},{"column": 1,"type":"name"}]
nomer.schema.output=[{"column":0,"type":"externalId"},{"column": 1,"type":"name"}]
nomer.taxon.name.correction.url=https://github.com/globalbioticinteractions/globi-taxon-names/raw/main/taxon-name-mapping.csv
nomer.taxon.name.stopword.url=https://github.com/globalbioticinteractions/globi-taxon-names/raw/main/non-taxon-words.txt
nomer.taxon.rank.cache.url=
nomer.taxon.rank.map.url=
nomer.term.cache.url=https://zenodo.org/record/3992313/files/taxonCache.tsv.gz
nomer.term.map.maxLinksPerTerm=125
nomer.term.map.url=https://zenodo.org/record/3992313/files/taxonMap.tsv.gz
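If you only care about one or two of these settings, a small filter over that listing can help. The sketch below assumes the output stays in the `key=value` form shown above; `prop_value` is a hypothetical helper name, not something nomer provides.

```shell
# Print the value of a single key from `key=value` lines read on stdin.
# prop_value is a hypothetical helper, not part of nomer.
prop_value() {
  key="$1"
  # Match lines that start with the literal "<key>=" and print everything
  # after that first '=' (values may themselves contain '=').
  awk -v k="$key" 'index($0, k "=") == 1 { print substr($0, length(k) + 2); exit }'
}

# Usage (assumes nomer is on your PATH):
#   nomer properties | prop_value nomer.term.cache.url
#   nomer properties | prop_value nomer.term.map.url
```

This makes it easy to check which taxonCache / taxonMap locations are currently in effect before you override them.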
You can then override properties by providing your own:
$ nomer --properties some.properties append ...
or
$ nomer -p some.properties append ...
where the some.properties file contains the properties you'd like to change. In your case, this might be:
nomer.term.cache.url=file:///home/nle/mydata/taxonCache.tsv.gz
nomer.term.map.url=file:///home/nle/mydata/taxonMap.tsv.gz
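For a build step, the properties file can also be generated rather than written by hand. This is only a sketch: `write_nomer_properties` is a hypothetical helper, the directory and file names are placeholders, and it assumes the directory is given as an absolute path so that the result has the `file:///...` form shown above.

```shell
# Write a properties file that points nomer at local copies of the
# taxon cache and taxon map.
# write_nomer_properties is a hypothetical helper, not part of nomer.
#   $1: absolute path of the directory holding taxonCache.tsv.gz and taxonMap.tsv.gz
#   $2: path of the properties file to create
write_nomer_properties() {
  data_dir="$1"
  props_file="$2"
  # "file://" followed by an absolute path gives file:///... URLs.
  {
    printf 'nomer.term.cache.url=file://%s/taxonCache.tsv.gz\n' "$data_dir"
    printf 'nomer.term.map.url=file://%s/taxonMap.tsv.gz\n' "$data_dir"
  } > "$props_file"
}

# Usage (paths are examples only):
#   write_nomer_properties /home/nle/mydata some.properties
#   nomer -p some.properties append ...
```

The function only writes the properties file; the two .tsv.gz files still need to be fetched once into that directory with whatever download tool your build already uses, for instance from the default nomer.term.cache.url and nomer.term.map.url locations listed above.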
If this works for you, please close the issue; otherwise, I'd be happy to hear more.
I can work from that, thank you!
Hi @jhpoelen.
My data integration pipeline uses nomer to map taxon names to db. For performance reasons, I primarily use globi-taxon-cache, and I'd like to be able to download the taxon map/cache once when building my app. Is that possible (apart from calling nomer append once to trigger the cache download)?