CottageLabs / idfind

An identifier identifier

recording multiple successful tests #33

Closed by emanuil-tolev 12 years ago

emanuil-tolev commented 12 years ago

If I have two tests with the following regexes:

  1. ([0-2]+)
  2. ([0-9]+)

and I try to identify "101", both will succeed. We display both successful results to the user, but record only the first successful result when creating the "identifier" doc in the ES index.
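A minimal sketch of the situation described above, in Python. Only the two regexes and the string "101" come from this issue; the test names, the `successful_tests` helper, and the use of `re.fullmatch` are assumptions made for illustration and are not taken from the idfind code.

```python
import re

# Hypothetical test definitions; only the two regexes come from the issue.
TESTS = [
    {"name": "test-1", "regex": r"([0-2]+)"},
    {"name": "test-2", "regex": r"([0-9]+)"},
]


def successful_tests(candidate, tests=TESTS):
    """Return every test whose regex matches the whole candidate string.

    Assumption: a test "succeeds" when its regex matches the full string.
    """
    return [t for t in tests if re.fullmatch(t["regex"], candidate)]


matches = successful_tests("101")

# Both tests succeed, so both would be shown to the user...
displayed = [t["name"] for t in matches]

# ...but recording only the first one drops the second identification.
recorded = matches[0]["name"] if matches else None
```

Here `displayed` holds both test names while `recorded` holds only the first, which is the information loss this issue is about.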

The structure of the "identifier" doc could be changed. Instead of a dictionary holding the properties of the first successful test (such as "name"), it could be a dictionary with 3 keys: "identifier" (the string the user tried to identify), "id" (a unique id, the same as _id on the top level of the ES document), and "what".

"what" would be a list of possibilities - this is what this identifier could be. So if we have 1 successful test during identification, "what" would only have 1 element - that would be a dictionary with "name", "url_prefix", "url_suffix", "tags", "regex" and so on. However, if multiple tests succeed in identifying the string, the list would have more elements, and each one of them would be a dictionary with "name" and so on.

"what" would basically be a list of what the identifier string could be.

Right now we're losing information (successful identifications when multiple tests succeed), so even if the solution described is not perfect, it will at least preserve all information from the identification process.
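The proposed document structure could look roughly like the sketch below. The key names "identifier", "id", "what", "name", "url_prefix", "url_suffix", "tags" and "regex" come from the description above; the `build_identifier_doc` helper, the sample values, and the use of `uuid` to generate an id are hypothetical and not taken from the idfind code.

```python
import uuid


def build_identifier_doc(identifier, successful_tests, doc_id=None):
    """Build an "identifier" doc with one "what" entry per successful test."""
    if doc_id is None:
        # Assumption: any unique string can serve as the id (and as the ES _id).
        doc_id = uuid.uuid4().hex
    return {
        "identifier": identifier,  # the string the user tried to identify
        "id": doc_id,
        # Copy each test's properties so the doc does not share state with the tests.
        "what": [dict(test) for test in successful_tests],
    }


# Hypothetical tests; only the regexes come from the issue.
tests = [
    {"name": "test-1", "regex": r"([0-2]+)", "url_prefix": "", "url_suffix": "", "tags": []},
    {"name": "test-2", "regex": r"([0-9]+)", "url_prefix": "", "url_suffix": "", "tags": []},
]

doc = build_identifier_doc("101", tests, doc_id="example-id")
```

With a single successful test, "what" would contain one dictionary; with two, as here, it contains two, so no successful identification is dropped.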

emanuil-tolev commented 12 years ago

Commits 4fa49fd2812cc3cf37611dd8d95e3874129db7c5 and 8ae3e3a73323267269873c650c9ec13940cac3c5 (in that order) implement the suggestion above.