The class media.data.nouns.Place should have a standard search-matching function that compares two Place values and reports whether they are an exact match or a partial match.
The Place object contains a major value, which is the most specific portion of the place, and a minor value, which is supplemental data that qualifies the major value. The keyordlist tool displays a place keyword by showing the major value and then appending the minor value in parentheses, followed by the titles that use it:

    properNoun/place/Tampa (Florida)                          The Punisher
    properNoun/place/Griffith Park Observatory (Los Angeles)  Devil In A Blue Dress
                                                              La La Land
                                                              Rebel Without A Cause

When code searches for a place keyword, it should ideally create a Place object from the search parameter and then compare the values of that search object against the target object being searched.
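A minimal sketch of that idea follows. PlaceSketch and parse_keyword are hypothetical names invented for illustration; they are not the real media.data.nouns.Place API, which this ticket does not describe in detail. The sketch only assumes that a place has a major value and an optional minor value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlaceSketch:
    """Hypothetical stand-in for media.data.nouns.Place (not the real class)."""
    major: str
    minor: Optional[str] = None

    def keyword(self) -> str:
        """Render the place as 'major (minor)', or just 'major'."""
        if self.minor:
            return f"{self.major} ({self.minor})"
        return self.major


def parse_keyword(text: str) -> PlaceSketch:
    """Build a PlaceSketch from search text such as 'Tampa (Florida)'."""
    text = text.strip()
    if text.endswith(")") and " (" in text:
        # Split at the first ' (' and drop the closing parenthesis.
        major, _, minor = text[:-1].partition(" (")
        return PlaceSketch(major.strip(), minor.strip() or None)
    return PlaceSketch(text)


# The search parameter becomes an object, then objects are compared.
search = parse_keyword("Tampa (Florida)")
target = PlaceSketch("Tampa", "Florida")
print(search == target)   # True
print(target.keyword())   # Tampa (Florida)
```

Comparing objects rather than raw strings keeps the major and minor values separate, which is what the partial-match cases later in this ticket depend on.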
The three entries below refer to the same place, but they are not textually identical:

    properNoun/place/Los Angeles Airport                            Collateral
    properNoun/place/Los Angeles Airport (Los Angeles)              Once Upon A Time In Hollywood
                                                                    Speed
    properNoun/place/Los Angeles Airport (Los Angeles, California)  Into The Night

All three reference the same place, but the minor details are either missing or different. A match between these values could be considered a possible_exact_match, whereas if all three records were 100% identical, they would be considered an exact_match.
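The distinction between exact_match and possible_exact_match could be sketched as follows. PlaceSketch, MatchKind, and compare_exactness are hypothetical names, and the NO_MATCH value and the case-insensitive comparison are assumptions added for illustration, not behaviour this ticket specifies.

```python
from enum import Enum
from typing import NamedTuple, Optional


class PlaceSketch(NamedTuple):
    """Hypothetical stand-in for media.data.nouns.Place."""
    major: str
    minor: Optional[str] = None


class MatchKind(Enum):
    EXACT_MATCH = "exact_match"
    POSSIBLE_EXACT_MATCH = "possible_exact_match"
    NO_MATCH = "no_match"  # assumption: the ticket does not name this case


def _norm(value: Optional[str]) -> str:
    """Case-fold and collapse whitespace so trivial differences are ignored."""
    return " ".join((value or "").casefold().split())


def compare_exactness(a: PlaceSketch, b: PlaceSketch) -> MatchKind:
    """Same major and same minor is exact; same major alone is possible."""
    if _norm(a.major) != _norm(b.major):
        return MatchKind.NO_MATCH
    if _norm(a.minor) == _norm(b.minor):
        return MatchKind.EXACT_MATCH
    return MatchKind.POSSIBLE_EXACT_MATCH


bare = PlaceSketch("Los Angeles Airport")
city = PlaceSketch("Los Angeles Airport", "Los Angeles")
full = PlaceSketch("Los Angeles Airport", "Los Angeles, California")

print(compare_exactness(bare, city).value)  # possible_exact_match
print(compare_exactness(city, full).value)  # possible_exact_match
print(compare_exactness(city, city).value)  # exact_match
```

One limitation of this sketch: two places that share a major value but have conflicting minor values, such as Springfield (Illinois) and Springfield (Missouri), would also be reported as possible_exact_match. Telling a conflicting minor value apart from a merely incomplete one is left open here.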
However, if the search parameter has a wider scope, like "Los Angeles", consider the following values:

    properNoun/place/Hollywood (Los Angeles)              The Aviator
    properNoun/place/Hollywood (Los Angeles, California)  Babylon
    properNoun/place/Hollywood Sign (Los Angeles)         Once Upon A Time In Hollywood
    properNoun/place/Los Angeles                          Annie Hall
    properNoun/place/Los Angeles (CA)                     L.A. Confidential
    properNoun/place/Los Angeles (California)             52 Pickup
    properNoun/place/Los Angeles Airport                  Collateral

The first three entries could be considered a related match: they are locations within "Los Angeles" and might be relevant, but they do not meet the definition of a match. The next three entries reference the same city, expressed in three different forms (which would be a possible_exact_match), and the seventh entry could be considered a string_pattern_match.
Technically, the fifth and sixth entries in the list above should be considered a stronger match to each other, as long as the code understands that CA is the abbreviation for California.
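The classification described for a "Los Angeles" search could be sketched like this. All names here (PlaceSketch, MatchKind, classify, ABBREVIATIONS) are hypothetical, the NO_MATCH value is an assumption, and the one-entry abbreviation table is illustrative only; real abbreviation data would presumably come from the st and cn element attributes.

```python
from enum import Enum
from typing import List, NamedTuple, Optional


class PlaceSketch(NamedTuple):
    """Hypothetical stand-in for media.data.nouns.Place."""
    major: str
    minor: Optional[str] = None


class MatchKind(Enum):
    EXACT_MATCH = "exact_match"
    POSSIBLE_EXACT_MATCH = "possible_exact_match"
    RELATED_MATCH = "related_match"
    STRING_PATTERN_MATCH = "string_pattern_match"
    NO_MATCH = "no_match"  # assumption: the ticket does not name this case


# Assumption: a tiny illustrative table, not real project data.
ABBREVIATIONS = {"ca": "california"}


def _norm(value: Optional[str]) -> str:
    """Case-fold, collapse whitespace, and expand a known abbreviation."""
    text = " ".join((value or "").casefold().split())
    return ABBREVIATIONS.get(text, text)


def _minor_parts(place: PlaceSketch) -> List[str]:
    """Split 'Los Angeles, California' into normalised components."""
    return [_norm(p) for p in (place.minor or "").split(",") if p.strip()]


def classify(search: PlaceSketch, target: PlaceSketch) -> MatchKind:
    s_major, t_major = _norm(search.major), _norm(target.major)
    if not s_major:
        return MatchKind.NO_MATCH
    if s_major == t_major:
        if _minor_parts(search) == _minor_parts(target):
            return MatchKind.EXACT_MATCH
        return MatchKind.POSSIBLE_EXACT_MATCH
    if s_major in _minor_parts(target):
        # The searched place qualifies the target, e.g. Hollywood (Los Angeles).
        return MatchKind.RELATED_MATCH
    if s_major in t_major:
        # Plain substring hit on the major value, e.g. Los Angeles Airport.
        return MatchKind.STRING_PATTERN_MATCH
    return MatchKind.NO_MATCH


search = PlaceSketch("Los Angeles")
print(classify(search, PlaceSketch("Hollywood", "Los Angeles")).value)
print(classify(search, PlaceSketch("Los Angeles", "CA")).value)
print(classify(search, PlaceSketch("Los Angeles Airport")).value)
```

Because this sketch expands CA to California before comparing, Los Angeles (CA) and Los Angeles (California) compare as identical to each other. Whether that should count as an exact_match or as a separate, stronger tier of possible_exact_match is a design choice the ticket leaves open. Note also that a bare "Los Angeles" target is reported as exact_match here, since it is identical to the search value.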
The search algorithm should return matches in the following order:
1. Exact matches
2. Possible matches
3. Related matches
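The ordering above could be sketched with a sortable rank. MatchRank and order_matches are hypothetical names. The string_pattern_match category is left out because the list above does not say where it should fall.

```python
from enum import IntEnum
from typing import List, Tuple


class MatchRank(IntEnum):
    """Hypothetical ranks; a lower value sorts earlier."""
    EXACT = 0
    POSSIBLE = 1
    RELATED = 2


def order_matches(results: List[Tuple[MatchRank, str]]) -> List[Tuple[MatchRank, str]]:
    """Sort by rank; sorted() is stable, so ties keep their input order."""
    return sorted(results, key=lambda item: item[0])


found = [
    (MatchRank.RELATED, "Hollywood (Los Angeles)"),
    (MatchRank.POSSIBLE, "Los Angeles (CA)"),
    (MatchRank.EXACT, "Los Angeles"),
    (MatchRank.POSSIBLE, "Los Angeles (California)"),
]

for rank, keyword in order_matches(found):
    print(rank.name, keyword)
```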
Also, most of the location data is expressed as string values. Considering that XML elements like st and cn now support attribute values to handle abbreviation helpers, states and countries should probably be promoted to actual Python objects.
(That work could be handled in a different ticket).
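As a rough illustration of what promoting a state to an object might look like, the sketch below pairs a name with its abbreviation. StateSketch and its matches method are hypothetical; the ticket does not define this class or how the st attributes would populate it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateSketch:
    """Hypothetical state object; not part of the existing library."""
    name: str
    abbreviation: str = ""

    def matches(self, text: str) -> bool:
        """True if text equals the name or the abbreviation, ignoring case."""
        candidate = text.strip().casefold()
        known = {self.name.casefold()}
        if self.abbreviation:
            known.add(self.abbreviation.casefold())
        return candidate in known


california = StateSketch("California", "CA")
print(california.matches("CA"))          # True
print(california.matches("california"))  # True
print(california.matches("Florida"))     # False
```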
This enhancement does not require a proof-of-concept command-line tool to facilitate search operations for the end user; such a tool could be handled in a different ticket.
This work isn't necessarily related to https://github.com/cjcodeproj/medialibrary/issues/153. This ticket is probably a blocker for the Settings comparison work.