NIAID-Data-Ecosystem / nde-crawlers

Harvesting infrastructure to collect and standardize dataset and computational tool metadata
Apache License 2.0

[Augmentation] Enable record-level override of augmented metadata #113

Open gtsueng opened 8 months ago

gtsueng commented 8 months ago

Metadata augmented via NLP tools such as PubTator is bound to contain context-dependent errors. For example, in one record PERCH may be some sort of technique that PubTator incorrectly extracts as a fish, while in another record PubTator correctly extracts perch as a fish.

The augmented metadata is currently stored in a SQLite database as a table relating terms and ontology ids. A table keyed by record ids will also be needed, so that an augmented entry can be overridden for an individual record.
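
One possible shape for this, as a rough sketch only: keep the existing term-to-ontology table as the default and add a second table keyed by record id and term, consulted first at lookup time. Everything named here is an assumption for illustration and not the current nde-crawlers schema: the table names `term_annotations` and `record_overrides`, the helpers `build_demo_db` and `resolve_annotations`, the record ids, and the placeholder ontology id `EXAMPLE:0001`. A `NULL` ontology id in the override table is taken to mean "suppress this term for this record".

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical table names."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Stand-in for the existing table relating terms and ontology ids.
        CREATE TABLE term_annotations (
            term        TEXT PRIMARY KEY,
            ontology_id TEXT NOT NULL
        );
        -- Proposed addition: per-record overrides.
        -- ontology_id NULL means "drop this term for this record".
        CREATE TABLE record_overrides (
            record_id   TEXT NOT NULL,
            term        TEXT NOT NULL,
            ontology_id TEXT,
            PRIMARY KEY (record_id, term)
        );
        """
    )
    # Placeholder ontology id, not a real identifier.
    conn.execute(
        "INSERT INTO term_annotations VALUES (?, ?)", ("perch", "EXAMPLE:0001")
    )
    # In record_A, PERCH is not a fish, so the annotation is suppressed.
    conn.execute(
        "INSERT INTO record_overrides VALUES (?, ?, ?)", ("record_A", "perch", None)
    )
    conn.commit()
    return conn


def resolve_annotations(conn, record_id, terms):
    """Return {term: ontology_id}, letting record-level overrides win."""
    resolved = {}
    for term in terms:
        key = term.lower()
        override = conn.execute(
            "SELECT ontology_id FROM record_overrides"
            " WHERE record_id = ? AND term = ?",
            (record_id, key),
        ).fetchone()
        if override is not None:
            # An override row exists: use its id, or skip the term if NULL.
            if override[0] is not None:
                resolved[key] = override[0]
            continue
        default = conn.execute(
            "SELECT ontology_id FROM term_annotations WHERE term = ?", (key,)
        ).fetchone()
        if default is not None:
            resolved[key] = default[0]
    return resolved
```

With this layout, `resolve_annotations(conn, "record_A", ["PERCH"])` would drop the fish annotation, while any record without an override row still gets the default mapping. Storing the override as a separate table leaves the existing term table untouched.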