relaton / relaton-itu

Relaton-ITU: retrieve ITU Standards for bibliographic use using the Relaton BibliographicItem model
https://www.relaton.org
MIT License
When ITU remote item is not available, Relaton crashes and cannot recover #61

Closed. ronaldtse closed this issue 2 years ago

ronaldtse commented 2 years ago

https://github.com/metanorma/metanorma-docker/runs/5715002378?check_suite_focus=true

#<Thread:0x00007fe759f94a50 /usr/local/bundle/gems/relaton-1.10.3/lib/relaton/workers_pool.rb:10 run> terminated with exception (report_on_exception is true):
/usr/local/bundle/gems/mechanize-2.8.4/lib/mechanize/http/agent.rb:332:in `fetch': 503 => Net::HTTPServiceUnavailable for https://www.itu.int/dms_pubrec/itu-t/rec/g/T-REC-G.655-200911-I!!SUM-HTM-E.htm -- unhandled response (Mechanize::ResponseCodeError)
    from /usr/local/bundle/gems/mechanize-2.8.4/lib/mechanize/http/agent.rb:1007:in `response_redirect'
    from /usr/local/bundle/gems/mechanize-2.8.4/lib/mechanize/http/agent.rb:324:in `fetch'
    from /usr/local/bundle/gems/mechanize-2.8.4/lib/mechanize.rb:465:in `get'
    from /usr/local/bundle/gems/relaton-itu-1.10.1/lib/relaton_itu/scrapper.rb:76:in `fetch_abstract'
    from /usr/local/bundle/gems/relaton-itu-1.10.1/lib/relaton_itu/scrapper.rb:58:in `parse_page'
    from /usr/local/bundle/gems/relaton-itu-1.10.1/lib/relaton_itu/hit.rb:11:in `fetch'
    from /usr/local/bundle/gems/relaton-itu-1.10.1/lib/relaton_itu/itu_bibliography.rb:127:in `block in isobib_results_filter'
    from /usr/local/lib/ruby/3.1.0/forwardable.rb:238:in `each'
    from /usr/local/lib/ruby/3.1.0/forwardable.rb:238:in `each'
    from /usr/local/bundle/gems/relaton-itu-1.10.1/lib/relaton_itu/itu_bibliography.rb:124:in `isobib_results_filter'
    from /usr/local/bundle/gems/relaton-itu-1.10.1/lib/relaton_itu/itu_bibliography.rb:138:in `itubib_get1'
    from /usr/local/bundle/gems/relaton-itu-1.10.1/lib/relaton_itu/itu_bibliography.rb:50:in `get'
    from /usr/local/bundle/gems/relaton-itu-1.10.1/lib/relaton_itu/processor.rb:17:in `get'
    from /usr/local/bundle/gems/relaton-1.10.3/lib/relaton/db.rb:432:in `net_retry'
    from /usr/local/bundle/gems/relaton-1.10.3/lib/relaton/db.rb:417:in `new_bib_entry'
    from /usr/local/bundle/gems/relaton-1.10.3/lib/relaton/db.rb:397:in `check_bibliocache'
    from /usr/local/bundle/gems/relaton-1.10.3/lib/relaton/db.rb:74:in `fetch'
    from /usr/local/bundle/gems/relaton-1.10.3/lib/relaton/db.rb:110:in `block in fetch_async'
    from /usr/local/bundle/gems/relaton-1.10.3/lib/relaton/workers_pool.rb:11:in `block (2 levels) in initialize'

This job died after 6 hours, following this crash in Relaton. I believe that right now, if one of Relaton's parallel worker threads dies, it is never joined, so the job stalls forever.
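The failure mode described above can be sketched in a few lines of Ruby. This is a hypothetical illustration, not Relaton's actual `WorkersPool`: the `run_jobs` helper and its job hash are assumptions. The idea is that every worker reports exactly one entry per job, even when the job raises, so a consumer waiting on the queue cannot block forever.

```ruby
# Hypothetical sketch, not Relaton's actual WorkersPool implementation.

# Runs each job in its own worker thread. Every worker pushes exactly one
# entry [name, value, error] onto the results queue, even if the job raises.
def run_jobs(jobs)
  results = Queue.new
  threads = jobs.map do |name, job|
    Thread.new do
      results << [name, job.call, nil]
    rescue StandardError => e
      # Without this rescue the thread would die without reporting, and
      # the consumer below would wait for a result that never arrives.
      results << [name, nil, e]
    end
  end
  # Pop one entry per job; this terminates because every worker reports.
  collected = jobs.size.times.map { results.pop }
  threads.each(&:join)
  collected
end
```

A caller can then inspect the third element of each entry to decide whether to warn or to re-raise.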

andrew2net commented 2 years ago

@ronaldtse in this case something went wrong on the server side when the abstract was fetched from the https://www.itu.int/dms_pubrec/itu-t/rec/g/T-REC-G.655-200911-I!!SUM-HTM-E.htm page. The page is available again now.

What can we do with the dead threads?

ronaldtse commented 2 years ago

> What can we do with the dead threads?
>
>   • Catch the error and warn. In this case the fetched document will have no abstract.
>   • Try to kill the dead thread after a timeout and raise an error in the main thread.

I think we need both. Relaton needs to tell the user if a particular document is not available (or is no longer available, e.g. deleted).

If anything goes wrong with the dead thread, it should be killed and reported back up the stack.
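A minimal sketch of the "kill after timeout and report back" idea, assuming a hypothetical `join_or_kill` helper and `WorkerTimeout` error class (neither exists in Relaton). It relies on standard `Thread#join` behaviour: with a timeout it returns `nil` if the thread is still running, and it re-raises the thread's exception in the caller if the thread died with one.

```ruby
# Hypothetical helper; the names and the timeout value are assumptions.
class WorkerTimeout < StandardError; end

# Waits up to `timeout` seconds for the worker thread.
# Returns the worker's value, re-raises the worker's own exception,
# or kills the worker and raises WorkerTimeout in the calling thread.
def join_or_kill(thread, timeout)
  # Thread#join returns nil on timeout; if the worker died with an
  # exception, join re-raises that exception here.
  return thread.value if thread.join(timeout)

  thread.kill
  raise WorkerTimeout, "worker did not finish within #{timeout}s"
end
```

Either way the failure surfaces in the main thread instead of leaving the job hanging.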

andrew2net commented 2 years ago

> I think we need both. Relaton needs to tell the user if a particular document is not available (or is no longer available, e.g. deleted).

@ronaldtse Relaton already handles errors when a document isn't available. In this situation, the abstract is fetched from a separate page using an additional HTTP request, and that additional request fails. Should we treat the entire document as unavailable if fetching the abstract fails?

ronaldtse commented 2 years ago

@andrew2net the abstract is less important, so if we have enough information, we want to make the document available to the user, but print a warning message that fetching the abstract failed for that item. Thanks!
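The agreed behaviour could look roughly like the sketch below. `fetch_abstract_safely` and its block argument are hypothetical and not part of relaton-itu. The block stands in for the extra HTTP request, and a real implementation would probably rescue a narrower class such as the `Mechanize::ResponseCodeError` seen in the trace rather than `StandardError`.

```ruby
# Hypothetical sketch; this method is illustrative, not relaton-itu API.

# Runs the given block (the abstract request). On failure it prints a
# warning to stderr and returns nil so the document can still be built.
def fetch_abstract_safely(doc_id)
  yield
rescue StandardError => e
  warn "WARNING: failed to fetch abstract for #{doc_id}: #{e.message}"
  nil
end
```

For example, `fetch_abstract_safely("ITU-T G.655") { raise IOError, "503" }` returns `nil` after printing the warning, so the caller can simply omit the abstract.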