relaton / relaton-bipm

(URGENT) Crash on incorrect Metrologia page number, captcha blocking should give warning #31

Closed ronaldtse closed 1 year ago

ronaldtse commented 1 year ago
$ bundle exec relaton fetch "BIPM Metrologia 34 3 9"
[relaton-bipm] ("BIPM Metrologia 34 3 9") fetching...
bundler: failed to load command: relaton (/Users/mulgogi/.asdf/installs/ruby/3.1.2/bin/relaton)
gems/3.1.0/gems/relaton-bipm-1.13.0/lib/relaton_bipm/bipm_bibliography.rb:223:in `get_article_from_issue': undefined method `[]' for nil:NilClass (NoMethodError)

        get_article rsp.at("//div[@class='indexer'][.='#{art}']/../div/a")[:href], vol, ish, agent
                                                                          ^^^^^^^
    from gems/3.1.0/gems/relaton-bipm-1.13.0/lib/relaton_bipm/bipm_bibliography.rb:78:in `get_metrologia'

Relaton fetch should never crash. Instead, it should tell the user which options and formats are correct and how to resolve the problem.
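
A minimal sketch of the kind of guard that would avoid the `NoMethodError`, assuming the lookup can return `nil` when no article matches. The `ISSUE_INDEX` Hash and the `article_href` method are hypothetical stand-ins for the XPath lookup in `get_article_from_issue`, not the gem's actual code:

```ruby
# Hypothetical stand-in for the parsed issue page: page number => article link.
# The real gem queries the HTML with XPath; a Hash keeps this sketch self-contained.
ISSUE_INDEX = { "261" => "/article/example-261" }.freeze

# Returns the article link for +page+, or nil (after printing a warning) when
# nothing in the issue matches. A missing match no longer raises NoMethodError.
def article_href(index, vol, ish, page)
  href = index[page]
  return href if href

  warn "[relaton-bipm] No article found at page #{page} of Metrologia " \
       "#{vol} #{ish}. Check that the third number is a page number."
  nil
end

article_href(ISSUE_INDEX, "34", "3", "261") # returns the link
article_href(ISSUE_INDEX, "34", "3", "9")   # warns and returns nil
```

The caller then only needs to check for `nil` before fetching the article.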

From the README:

$ bundle exec relaton fetch "BIPM Metrologia 29 6 373"
[relaton-bipm] ("BIPM Metrologia 29 6 373") fetching...
[relaton-bipm] https://iopscience.iop.org/issue/0026-1394/29/6 is redirected to https://hcvalidate.perfdrive.com/?ssa=f2604e7a-58f7-496e-b876-d1a3904429a0&ssb=28176286894&ssc=https%3A%2F%2Fiopscience.iop.org%2Fissue%2F0026-1394%2F29%2F6&ssi=a6758d39-8427-42a7-88e3-56319791a6e8&ssk=support@shieldsquare.com&ssm=41782385357657602108803376014196&ssn=f9933558dbe91f381fba1dfb1a185734aab934bed76f-0c93-4aa2-a22ddb&sso=00407723-ad3ab8fa2b28ca5c25a65529bac7c07aabac4ebc7d3a4115&ssp=45956966461661830780166188508531392&ssq=57287423354462992882933544704900734576778&ssr=MjE4LjE4OC40My4yNTQ=&sst=Mozilla/5.0 (iPad; CPU OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1&ssv=&ssw=&ssx=W10=
redirected to https://hcvalidate.perfdrive.com/?ssa=f2604e7a-58f7-496e-b876-d1a3904429a0&ssb=28176286894&ssc=https%3A%2F%2Fiopscience.iop.org%2Fissue%2F0026-1394%2F29%2F6&ssi=a6758d39-8427-42a7-88e3-56319791a6e8&ssk=support@shieldsquare.com&ssm=41782385357657602108803376014196&ssn=f9933558dbe91f381fba1dfb1a185734aab934bed76f-0c93-4aa2-a22ddb&sso=00407723-ad3ab8fa2b28ca5c25a65529bac7c07aabac4ebc7d3a4115&ssp=45956966461661830780166188508531392&ssq=57287423354462992882933544704900734576778&ssr=MjE4LjE4OC40My4yNTQ=&sst=Mozilla/5.0 (iPad; CPU OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1&ssv=&ssw=&ssx=W10=

This happens because IOP now uses an anti-DDoS CAPTCHA service.

I had to manually open the given link (https://iopscience.iop.org/issue/0026-1394/29/6), resolve the CAPTCHA, and then re-run the command to get the information.

If we get this redirect, we should ask the user to open the original link manually and resolve the CAPTCHA:

This source employs anti-DDoS measures that unfortunately affect automated requests.
Please visit this link in your browser to resolve the CAPTCHA, then retry:
  https://iopscience.iop.org/issue/0026-1394/29/6

After resolving the CAPTCHA manually, the same command succeeds:

$ bundle exec relaton fetch "BIPM Metrologia 29 6 373"
[relaton-bipm] ("BIPM Metrologia 29 6 373") fetching...
[relaton-bipm] ("BIPM Metrologia 29 6 373") found Metrologia 29 6 373
<bibdata type="standard">
...
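
The redirect check and warning proposed above could be sketched roughly as follows. This assumes that a redirect to a different host means the request was intercepted by the CAPTCHA service. The method names `blocked_by_captcha?` and `captcha_warning` are hypothetical, and the redirect URL is shortened to a single query parameter:

```ruby
require "uri"

# Assumption: landing on a different host than the one requested means the
# request was intercepted by the anti-DDoS / CAPTCHA service.
def blocked_by_captcha?(requested_url, final_url)
  URI.parse(requested_url).host != URI.parse(final_url).host
end

# Builds the user-facing message, pointing at the ORIGINAL link rather than
# the long redirect target.
def captcha_warning(requested_url)
  <<~MSG
    This source employs anti-DDoS measures that unfortunately affect automated requests.
    Please visit this link in your browser to resolve the CAPTCHA, then retry:
      #{requested_url}
  MSG
end

requested = "https://iopscience.iop.org/issue/0026-1394/29/6"
# Redirect target from the log above, with the query string shortened.
final = "https://hcvalidate.perfdrive.com/?ssc=https%3A%2F%2Fiopscience.iop.org%2Fissue%2F0026-1394%2F29%2F6"

warn captcha_warning(requested) if blocked_by_captcha?(requested, final)
```

Comparing hosts is only one possible heuristic; matching the known CAPTCHA host explicitly would be stricter but would need updating if IOP changes providers.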

I also finally found out why the first command does not work: the third parameter is the PAGE NUMBER, not the "article number" as stated in the README (we need to update the README):

$ bundle exec relaton fetch "BIPM Metrologia 34 3 261"
[relaton-bipm] ("BIPM Metrologia 34 3 261") fetching...
[relaton-bipm] ("BIPM Metrologia 34 3 261") found Metrologia 34 3 261
<bibdata type="standard">
  <fetched>2022-08-30</fetched>
...
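
To make the meaning of the three numbers explicit, here is a small sketch of how such a reference could be split into volume, issue, and page. The regular expression and `parse_metrologia_ref` are hypothetical and only cover the three-number form used in this issue; the gem's actual parsing code is not shown here:

```ruby
# Hypothetical pattern: "BIPM Metrologia <volume> <issue> <page>".
# The third number is the page number, not an article number.
METROLOGIA_REF = /\ABIPM Metrologia (?<volume>\d+) (?<issue>\d+) (?<page>\d+)\z/

# Returns a Hash with :volume, :issue and :page (as strings), or nil when the
# reference does not match the three-number form.
def parse_metrologia_ref(ref)
  m = METROLOGIA_REF.match(ref)
  return nil unless m

  { volume: m[:volume], issue: m[:issue], page: m[:page] }
end

ref = parse_metrologia_ref("BIPM Metrologia 34 3 261")
puts "volume #{ref[:volume]}, issue #{ref[:issue]}, page #{ref[:page]}"
```

Naming the third capture `page` in code and in the README would make the mistake in the first command easier to spot.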

ronaldtse commented 1 year ago

Confirmed fixed. Thanks!