metanorma / pubid-core

BSD 2-Clause "Simplified" License

Ignore some attributes when comparing IDs #21

Closed andrew2net closed 1 year ago

andrew2net commented 1 year ago

Sometimes we need to find the latest edition, release, or year of publication. Sometimes we need to select an all-parts collection. For example, pubid_ref == pubid returns false if pubid_ref doesn't have an edition but pubid does. We need a way to ignore some attributes if they are undefined in pubid_ref. For example, we could pass a list of attributes that should be ignored:

pubid_ref == pubid, [:edition, :year]

mico commented 1 year ago

@andrew2net how about we add an option to compare only the elements that are defined? Then we don't need a list of attributes to ignore.

Something like: pubid_ref.eql?(pubid, only_defined: true)

andrew2net commented 1 year ago

@mico some undefined elements should still be used in the comparison. We have various cases across the relaton-* gems where some elements should be ignored when comparing. So we need a way to exclude some elements from the comparison, but not all undefined elements.
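To make the difference between the two proposals concrete, here is a minimal Ruby sketch. It does not use the real pubid-core classes: the Id struct, its attribute names, the methods match_defined? and match_ignoring?, and the ISO 1234 identifiers are all hypothetical and serve only as an illustration.

```ruby
# Hypothetical stand-in for a Pubid identifier (NOT the real pubid-core class).
Id = Struct.new(:publisher, :number, :part, :year, :edition, keyword_init: true) do
  # "only_defined" style: compare only the attributes set on the reference (self).
  def match_defined?(other)
    to_h.all? { |attr, value| value.nil? || other[attr] == value }
  end

  # Ignore-list style: compare every attribute, except those that are
  # undefined on the reference AND named in `ignore`.
  def match_ignoring?(other, ignore)
    to_h.all? do |attr, value|
      next true if value.nil? && ignore.include?(attr)

      other[attr] == value
    end
  end
end

ref   = Id.new(publisher: "ISO", number: "1234")
part  = Id.new(publisher: "ISO", number: "1234", part: "1", year: 2019)
whole = Id.new(publisher: "ISO", number: "1234", year: 2019)

ref.match_defined?(part)             # => true  (part and year are both skipped)
ref.match_defined?(whole)            # => true
ref.match_ignoring?(part, [:year])   # => false (the undefined part must still match)
ref.match_ignoring?(whole, [:year])  # => true  (only the year is ignored)
```

With the ignore list, a reference without a part still rejects a part document, which the "only defined" comparison cannot express.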

mico commented 1 year ago

@andrew2net could you provide more details about the cases where we need to exclude some elements but not all undefined ones?

andrew2net commented 1 year ago

We have many cases in the relaton-* gems, and the requirements are still changing. So we need some flexibility in ID comparison. Some cases are:

mico commented 1 year ago

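Summarizing cases, we need functionality to:

  • return latest year
  • return identifiers for all years
  • return latest edition
  • return identifiers for all editions
  • return identifiers for all edition months (ignore month, but apply year)
  • return documents with any types and stages
  • return documents with any stages
  • return documents with any parts
  • return documents with any revisions
  • return documents with any amendment's edition (year)
  • return documents with any amendments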

Did I miss something?

What is value in relaton-ecma?

Looking at that, I agree that an exclusion list could solve the problem we have. I also see that you sometimes need to fetch the latest year or edition, so we can add an option for fetching the latest version.

andrew2net commented 1 year ago

We have two scenarios in Relaton: fetching from a Relaton repository and fetching from websites.

Relaton repositories have indexes: YAML files with an ID => filename structure. Most indexes currently store the ID as a string, but they are going to store a Hash with the Pubid#to_h content (that is what we need the to_h function for). With IDs as Hashes we can create a document Pubid (can we?) and compare it against a reference Pubid. We need to select the document Pubids that match all the attributes present in the reference Pubid, ignoring attributes that are both omitted from the reference and listed in an ignore list (attr.nil? && ignore_list.include?(attr)). Next we can find the ID with the highest value of some attribute, or create a combined all_parts document, or do whatever else we need.
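The selection rule described above could be sketched roughly as follows. This is only an illustration on plain Hashes, assuming index IDs shaped like a Pubid#to_h result. The INDEX data, the file paths, and the helpers match? and select_matching are hypothetical and are not part of pubid-core or Relaton.

```ruby
# Hypothetical index: each ID is a Hash, as Pubid#to_h might produce.
INDEX = [
  { id: { publisher: "ISO", number: "1234", year: 2015 }, file: "data/iso-1234-2015.yaml" },
  { id: { publisher: "ISO", number: "1234", year: 2019 }, file: "data/iso-1234-2019.yaml" },
  { id: { publisher: "ISO", number: "1234", part: "1", year: 2019 }, file: "data/iso-1234-1-2019.yaml" },
  { id: { publisher: "ISO", number: "5678", year: 2020 }, file: "data/iso-5678-2020.yaml" }
].freeze

# True when `doc` agrees with `ref` on every attribute, skipping attributes
# that are nil in `ref` and named in `ignore`.
def match?(ref, doc, ignore)
  (ref.keys | doc.keys).all? do |attr|
    next true if ref[attr].nil? && ignore.include?(attr)

    ref[attr] == doc[attr]
  end
end

# Select the index entries whose ID matches the reference.
def select_matching(index, ref, ignore)
  index.select { |entry| match?(ref, entry[:id], ignore) }
end

ref = { publisher: "ISO", number: "1234" }

# Ignore only the year: both undated-reference candidates, but not the part.
candidates = select_matching(INDEX, ref, [:year])
candidates.map { |e| e[:file] }
# => ["data/iso-1234-2015.yaml", "data/iso-1234-2019.yaml"]

# The "latest year" case then becomes a plain max_by over the candidates.
latest = candidates.max_by { |e| e[:id][:year].to_i }
latest[:file] # => "data/iso-1234-2019.yaml"
```

Passing [:year, :part] as the ignore list would also select the part document, which is the starting point for an all-parts collection.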

Website search engines return a list of document IDs as strings. We need to parse these IDs and compare the document Pubids with the reference Pubid in the same way as described above.

andrew2net commented 1 year ago

Summarizing cases, we need functionality to:

  • return latest year
  • return identifiers for all years
  • return latest edition
  • return identifiers for all editions
  • return identifiers for all edition months (ignore month, but apply year)
  • return documents with any types and stages
  • return documents with any stages
  • return documents with any parts
  • return documents with any revisions
  • return documents with any amendment's edition (year)
  • return documents with any amendments

Did I miss something?

These are parts of Relaton's internal functionality. We don't need to hardcode each case in Pubid. The cases can change later. So, in Relaton we need:

What is value in relaton-ecma?

It's volume, sorry for the mistake. Here is a part of the ECMA index:

...
- :id:
    :id: ECMA-269
    :ed: '2'
  :file: data/ECMA-269-2.yaml
- :id:
    :id: ECMA-269
    :ed: '3'
    :vol: '1'
  :file: data/ECMA-269-3-1.yaml
- :id:
    :id: ECMA-269
    :ed: '3'
    :vol: '2'
  :file: data/ECMA-269-3-2.yaml
...

With ref ECMA-269, all 3 entries are selected. With ref ECMA-269-2, only the first entry is selected. With ref ECMA-269-3, the last 2 entries are selected.
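The selection over this index fragment could be reproduced with a sketch like the one below. It assumes the references have already been parsed into Hashes with the same keys as the index (:id, :ed, :vol). The select_entries helper and the hand-written reference Hashes are hypothetical, and parsing a string such as ECMA-269-3 is left out.

```ruby
require "yaml"

# The index fragment from above, without the elided parts.
INDEX_YAML = <<~YAML
  - :id:
      :id: ECMA-269
      :ed: '2'
    :file: data/ECMA-269-2.yaml
  - :id:
      :id: ECMA-269
      :ed: '3'
      :vol: '1'
    :file: data/ECMA-269-3-1.yaml
  - :id:
      :id: ECMA-269
      :ed: '3'
      :vol: '2'
    :file: data/ECMA-269-3-2.yaml
YAML

# Keep the entries that agree with every attribute present in the reference;
# attributes the reference omits (edition, volume) are not compared.
def select_entries(index, ref)
  index.select { |entry| ref.all? { |attr, value| entry[:id][attr] == value } }
end

# The index uses Symbol keys, so Symbol has to be permitted explicitly.
index = YAML.safe_load(INDEX_YAML, permitted_classes: [Symbol])

select_entries(index, { id: "ECMA-269" }).map { |e| e[:file] }
# => ["data/ECMA-269-2.yaml", "data/ECMA-269-3-1.yaml", "data/ECMA-269-3-2.yaml"]
select_entries(index, { id: "ECMA-269", ed: "2" }).map { |e| e[:file] }
# => ["data/ECMA-269-2.yaml"]
select_entries(index, { id: "ECMA-269", ed: "3" }).map { |e| e[:file] }
# => ["data/ECMA-269-3-1.yaml", "data/ECMA-269-3-2.yaml"]
```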