CatalogueOfLife / general

The Catalogue of Life

inquiry about cp_name_match function #90

Closed YaquanChang closed 3 years ago

YaquanChang commented 3 years ago

Dear Markus Döring,

Thanks for publishing this package; it has helped a lot with getting accepted species names. I have a question about the cp_name_match function. Does the type "variant" mean that the name is a synonym? If so, I found some species that cp_name_match labels "variant" but that are still shown as accepted species on the Catalogue of Life website.

For example, for "Erechtites valerianifolius", "Carex brownii", and "Eragrostis brownii", cp_name_match returns "Erechtites valerianifolia", "Carex brownei", and "Eragrostis brownei", respectively. But on the Catalogue of Life website, their names are still accepted.

Thanks in advance for your time,

Best, Yaquan

mdoering commented 3 years ago

Hi, I assume you are talking about the R package, which I have neither written nor used. I believe it makes two consecutive calls to the API: first a match against the COL ChecklistBank NamesIndex, then a second call to find the latest COL Checklist name/taxon.

The names index matching is a rather strict, pure name matching. It also returns a match type, which is one of:

  /**
   * The canonical name and authorship (if given) matches exactly
   */
  EXACT,

  /**
   * The name matches an orthographic variant of the name, authorship and/or rank (for family and above)
   * which is considered to be the same name still.
   */
  VARIANT,

  /**
   * Name matched to canonical name only even though it had an authorship, but could not be inserted.
   * Can only happen if insertion is not allowed, e.g. via external API requests.
   */
  CANONICAL,

  /**
   * The name matched several names and could not be clearly disambiguated.
   * Usually only happens for canonical monomials without authorship.
   */
  AMBIGUOUS,

  /**
   * No matching name.
   */
  NONE;

So in that regard, "variant" only means that the input name differs slightly from the name kept in the names index. It says nothing about the name COL accepts, which might be the same or, again, slightly different.
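
To make the distinction concrete, here is a minimal Java sketch of the two-step lookup described above. It is only an illustration under stated assumptions: it does not call the real ChecklistBank API or the R package. The class name, the `Status` enum, the in-memory tables and all method names are hypothetical, and the toy data simply replays the three names from the question, with the checklist status set to accepted as the reporter observed on the website.

```java
import java.util.Map;
import java.util.Set;

public class VariantMatchSketch {

    /** Match types as quoted in the reply above (this toy only produces three of them). */
    public enum MatchType { EXACT, VARIANT, CANONICAL, AMBIGUOUS, NONE }

    /** Hypothetical checklist status, invented for this sketch. */
    public enum Status { ACCEPTED, SYNONYM, NOT_FOUND }

    // Toy names index: the spellings the index happens to keep (from the question).
    static final Set<String> INDEX = Set.of(
        "Carex brownei", "Eragrostis brownei", "Erechtites valerianifolia");

    // Toy variant table: input spelling -> spelling kept in the index.
    static final Map<String, String> VARIANT_OF = Map.of(
        "Carex brownii", "Carex brownei",
        "Eragrostis brownii", "Eragrostis brownei",
        "Erechtites valerianifolius", "Erechtites valerianifolia");

    // Toy checklist: status of each indexed name (assumed, per the reporter's observation).
    static final Map<String, Status> CHECKLIST = Map.of(
        "Carex brownei", Status.ACCEPTED,
        "Eragrostis brownei", Status.ACCEPTED,
        "Erechtites valerianifolia", Status.ACCEPTED);

    /** Step 1: how does the input spelling relate to a name in the index? */
    public static MatchType matchType(String input) {
        if (INDEX.contains(input)) return MatchType.EXACT;
        if (VARIANT_OF.containsKey(input)) return MatchType.VARIANT;
        return MatchType.NONE;
    }

    /** Resolves the input to the spelling kept in the index, or null if unmatched. */
    public static String indexName(String input) {
        if (INDEX.contains(input)) return input;
        return VARIANT_OF.get(input);
    }

    /** Step 2: taxonomic status of the matched name, looked up separately. */
    public static Status status(String input) {
        String name = indexName(input);
        if (name == null) return Status.NOT_FOUND;
        return CHECKLIST.getOrDefault(name, Status.NOT_FOUND);
    }

    public static void main(String[] args) {
        for (String input : VARIANT_OF.keySet()) {
            System.out.println(input + " -> " + indexName(input)
                + " [" + matchType(input) + ", " + status(input) + "]");
        }
    }
}
```

In this sketch the match type comes only from comparing spellings in step 1, while the status comes from a separate lookup in step 2, so a name can be both `VARIANT` and `ACCEPTED` at once.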