relaton / relaton-iso

RelatonIso: ISO Standards metadata using the BibliographicItem model
BSD 2-Clause "Simplified" License

Sorting through entries returned from isobib #25

Closed: opoudjis closed this issue 6 years ago

opoudjis commented 6 years ago

Isobib returns a long list of entries, and the first entry found is not necessarily the right one to return. Both @andrew2net and I have several routines in asciidoctor-iso that sort through the returned isobib entries by (a) confirming that the right document code has been returned; (b) trying to match on year; and (c) ignoring matches with no titles (e.g. https://www.iso.org/standard/15905.html).

This code should move into isobib; it shouldn't be the client's job to sift through all the match hits from the ISO website. I'll move the code into isobib once it has stabilised in asciidoctor-iso; I'm still working on it.

FYI, this is the code so far:

      def fetch_year_check(hit, code, year, opts)
        ret = nil
        if year.nil? || year.to_i == hit.hit["year"]
          ret = hit.to_xml opts
          @bibliodb[code] = ret
        else
          warn "WARNING: cited year #{year} does not match year "\
            "#{hit.hit['year']} found on the ISO website for #{code}"
        end
        ret
      end

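      def first_with_title(result)
        # Guard against an empty or unexpected result before iterating.
        return nil unless result.first.is_a?(Array)
        # Return the first hit that has a title, or nil if none does.
        result.first.find { |x| x.hit["title"] }
      end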

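      def first_year_match_hit(result, code, year)
        # Check the result shape first, so the year-less path is guarded too.
        return nil unless result.first.is_a?(Array)
        return first_with_title(result) if year.nil?
        coderegex = %r{^(ISO|IEC)[^0-9]*\s[0-9-]+}
        result.first.each do |x|
          next unless x.hit["title"]
          # Prefer a hit whose title starts with the cited code and whose year matches.
          return x if x.hit["title"].match(coderegex)&.to_s == code &&
            year.to_i == x.hit["year"]
        end
        # No exact code-and-year match: fall back to the first titled hit.
        first_with_title(result)
      end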

      def fetch_ref1(code, year, opts)
        return @bibliodb[code] if @bibliodb[code]
        result = Isobib::IsoBibliography.search(code)
        ret = nil
        hit = first_year_match_hit(result, code, year)
        coderegex = %r{^(ISO|IEC)[^0-9]*\s[0-9-]+}
        if hit && hit.hit["title"]&.match(coderegex)&.to_s == code
          ret = fetch_year_check(hit, code, year, opts)
        else
          warn "WARNING: no match found on the ISO website for #{code}"
        end
        ret
      end
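
To make the code check concrete, here is a small standalone sketch of what the `coderegex` pattern extracts. The helper name `code_from_title` and the sample titles are made up for illustration; they assume that a hit title starts with the document code, as the routines above do.

```ruby
# Same pattern as coderegex in the routines above.
CODE_REGEX = %r{^(ISO|IEC)[^0-9]*\s[0-9-]+}

# Hypothetical helper (not part of isobib): take the document code from
# the start of a hit title, or return nil if the title is missing or
# does not start with a recognisable code.
def code_from_title(title)
  title&.match(CODE_REGEX)&.to_s
end

# Made-up sample titles, for illustration only.
samples = [
  "ISO 19115-1:2014 Geographic information",
  "ISO/IEC 27001:2013 Information technology",
  "Geographic information",
  nil
]
samples.each { |t| p code_from_title(t) }
```

Because the comparison against `code` is an exact string match, a search for "ISO 19115" would not accept a hit whose title starts with "ISO 19115-1" under these assumptions.
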

andrew2net commented 6 years ago

@opoudjis could we close this issue?

opoudjis commented 6 years ago

No, because I want to check what you've done against what filtering I already have in place. I'll be doing that next week.

andrew2net commented 6 years ago

@opoudjis I don't understand what I should do with this issue, but I want to mention that hit.hit["year"] isn't the year of publication. The hit hash is returned by the Algolia search service, and I don't know what hit.hit["year"] represents. I suppose we could extract the publication year from hit.hit["title"].
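
A minimal sketch of that last idea, assuming the title begins with the document code followed by a colon and a four-digit year (e.g. "ISO 19115-1:2014 ..."). The helper `year_from_title` and the title format are assumptions for illustration, not something isobib or Algolia is confirmed to provide.

```ruby
# Assumed title layout: "<code>:<yyyy> <rest of title>".
YEAR_REGEX = %r{^(?:ISO|IEC)[^0-9]*\s[0-9-]+:(\d{4})}

# Hypothetical helper: return the publication year as an Integer,
# or nil when the title is missing or carries no ":yyyy" suffix.
def year_from_title(title)
  m = title&.match(YEAR_REGEX)
  m && m[1].to_i
end

# Made-up sample titles, for illustration only.
p year_from_title("ISO 19115-1:2014 Geographic information")
p year_from_title("ISO 9000 Quality management")
```

A year check could then compare `year.to_i` with this value instead of hit.hit["year"], treating a nil result as "year unknown".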

opoudjis commented 6 years ago

It's a request to refactor isobib to do filtering by year; but there's nothing to be done until I have filtering code in place in asciidoctor-iso that works.

opoudjis commented 6 years ago

Obsoleted by code included in #24