PathwayCommons / factoid

A project to capture biological pathway data from academic papers
https://biofactoid.org
MIT License

Grounding Assist: Capture hints in paper #1148

Open jvwong opened 1 year ago

jvwong commented 1 year ago

Description

Q: What is the name of the feature?

A: Grounding Assist

Q: What does this feature enable the user to do?

A: Indirectly, disambiguate a name for a bioentity (e.g. gene) more accurately

Q: What information must the user provide to use the feature?

A: (1) Article information; (2) names of bioentities

Q: What are the applicable constraints, e.g. compatibility or performance?

A: There are three main cases to consider:

  1. Default: No prior information is available
  2. Bioentity database identifiers are available
  3. Species information is available

Q: How does this feature affect each class of user (persona)?

A: Synonyms and orthologues account for a large proportion of observed errors (30%). Other types of errors (e.g. spelling issues) could conceivably be mitigated as well, and hints could enable features such as a true "type-ahead" autocomplete.

Specification

Sources of bioentity information

Scoring algorithm

To be determined. The algorithm should consider:

  1. Location: Prioritization based on mention in title vs abstract vs body
  2. Type: Local hint (e.g. entity database IDs) vs global (e.g. species)
  3. Reliability of source
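
One way the three considerations above might combine is as a single weight per hint. The following is a minimal Python sketch only, since the actual algorithm is still to be determined: the `Hint` class, the `hint_weight` and `rank_hints` functions, and every numeric weight are hypothetical placeholders, not values or names from factoid or grounding-search.

```python
from dataclasses import dataclass

# Hypothetical placeholder weights; the real values are to be determined.
LOCATION_WEIGHT = {"title": 1.0, "abstract": 0.7, "body": 0.4}
SCOPE_WEIGHT = {"local": 1.0, "global": 0.5}


@dataclass(frozen=True)
class Hint:
    text: str           # e.g. an entity name, a database ID or a species name
    location: str       # "title", "abstract" or "body"
    scope: str          # "local" (e.g. entity database ID) or "global" (e.g. species)
    reliability: float  # assumed trust in the source of the hint, between 0 and 1


def hint_weight(hint: Hint) -> float:
    """Multiply the location, type and reliability factors into one weight."""
    return (
        LOCATION_WEIGHT[hint.location]
        * SCOPE_WEIGHT[hint.scope]
        * hint.reliability
    )


def rank_hints(hints):
    """Order hints from most to least influential."""
    return sorted(hints, key=hint_weight, reverse=True)


# Usage: a local hint in the body can outrank a global hint in the abstract.
species_hint = Hint("Homo sapiens", "abstract", "global", 0.9)
id_hint = Hint("PLACEHOLDER_DB_ID", "body", "local", 1.0)
ranked = rank_hints([species_hint, id_hint])
```

A multiplicative form is only one option; it has the property that a hint from an unreliable source is discounted regardless of where it appears.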

Tasks

The factoid project should be responsible solely for obtaining bioentity hints for a given article:

At least for network curation, grounding-search should be responsible for scoring search hits in light of hints.
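
The split of responsibilities described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the API of either project: `collect_hints`, `rescore_hits`, the dictionary fields and the `boost` factor are all hypothetical.

```python
def collect_hints(article):
    """factoid side (hypothetical): gather hints for an article, no scoring."""
    hints = []
    for species in article.get("species", []):
        hints.append({"type": "species", "value": species})
    for db_id in article.get("entity_ids", []):
        hints.append({"type": "entity_id", "value": db_id})
    return hints


def rescore_hits(hits, hints, boost=2.0):
    """grounding-search side (hypothetical): re-rank search hits using hints."""
    species = {h["value"] for h in hints if h["type"] == "species"}
    ids = {h["value"] for h in hints if h["type"] == "entity_id"}

    def score(hit):
        s = hit["score"]
        if hit.get("organism") in species:  # global hint matches
            s *= boost
        if hit.get("id") in ids:            # local hint matches
            s *= boost
        return s

    # With no hints, the original score order is preserved (default case).
    return sorted(hits, key=score, reverse=True)


# Usage with made-up search hits.
hits = [
    {"id": "a", "organism": "mouse", "score": 1.0},
    {"id": "b", "organism": "human", "score": 0.8},
]
with_hints = rescore_hits(hits, collect_hints({"species": ["human"]}))
without_hints = rescore_hits(hits, collect_hints({}))
```

Keeping the hint collection free of any scoring logic means the three cases (no prior information, database identifiers, species information) differ only in which hints are present in the list handed over.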

References

jvwong commented 4 weeks ago