hongcui / FNATextProcessing

producing clean FNA input

Files missing - merged files in Volume 6 #24

Open bibilujan opened 6 years ago

bibilujan commented 6 years ago

Hypericum mutilum subsp. mutilum does not have its own file among the coarse-grained XML files, but its information is present: the text for Hypericum mutilum subsp. mutilum seems to be in the Hypericum mutilum treatment file, 158.xml.

For reference:

  1. Hypericum mutilum http://www.efloras.org/florataxon.aspx?flora_id=1&taxon_id=242416683
  2. Hypericum mutilum subsp. mutilum http://www.efloras.org/florataxon.aspx?flora_id=1&taxon_id=250013313

A similar error was found for Cucurbita foetidissima: it has no file of its own, and the file for the genus Cucurbita (79.xml) contains the text for both, merged. For reference:

  1. Cucurbita http://www.efloras.org/florataxon.aspx?flora_id=1&taxon_id=108644
  2. Cucurbita foetidissima http://www.efloras.org/florataxon.aspx?flora_id=1&taxon_id=200022618
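
One way to look for more cases like these two is sketched below. It is only a rough illustration, not part of the repository: the function name `find_merged_candidates`, the shape of the `treatments` mapping, and the sample texts are all hypothetical, and it assumes the taxon name and plain text of each treatment file have already been extracted by some other step. For every expected taxon that has no file of its own, it lists the files whose text mentions that name.

```python
def find_merged_candidates(expected_names, treatments):
    """Report expected taxa that lack their own file.

    expected_names: iterable of taxon names that should each have a file.
    treatments: dict mapping file name -> (taxon name of the file, plain text).
                This structure is an assumption made for the sketch.
    Returns a dict mapping each missing name to the sorted list of files
    whose text mentions it (candidates for a merged treatment).
    """
    # Names that already have a dedicated treatment file.
    present = {name for name, _text in treatments.values()}

    report = {}
    for name in expected_names:
        if name in present:
            continue
        # Plain substring search; a real check may need fuzzier matching.
        report[name] = sorted(
            fname for fname, (_taxon, text) in treatments.items() if name in text
        )
    return report


# Placeholder data: file names come from this issue, the texts are made up.
treatments = {
    "158.xml": ("Hypericum mutilum",
                "Hypericum mutilum ... Hypericum mutilum subsp. mutilum ..."),
    "79.xml": ("Cucurbita", "Cucurbita ... Cucurbita foetidissima ..."),
}
expected = [
    "Hypericum mutilum",
    "Hypericum mutilum subsp. mutilum",
    "Cucurbita",
    "Cucurbita foetidissima",
]

report = find_merged_candidates(expected, treatments)
for missing, files in report.items():
    print(f"{missing}: no own file; mentioned in {files}")
```

An empty list for a missing name would mean the text was not found anywhere, which is a different problem from a merged file.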