relaton / relaton-iso

RelatonIso: ISO Standards metadata using the BibliographicItem model
BSD 2-Clause "Simplified" License

(URGENT) Titles missing for many many documents!! #166

Closed ronaldtse closed 3 months ago

ronaldtse commented 3 months ago
bundle exec relaton fetch "ISO 19135-1:2015"
[relaton-iso] (ISO 19135-1:2015) Fetching from Relaton repository ...
[relaton-iso] (ISO 19135-1:2015) Found: `ISO 19135-1:2015`
<bibdata type="standard" schema-version="v1.2.8">
  <fetched>2024-05-29</fetched>
  <title type="main" format="text/plain" language="en" script="Latn"/>
  <title type="main" format="text/plain" language="fr" script="Latn"/>
  <uri type="src">https://www.iso.org/standard/54721.html</uri>

ronaldtse commented 3 months ago

Reported here: https://github.com/Spatial-Web-Foundation/SWF-Corpus_and_IEEEP2874-D2/pull/1543

ReesePlews commented 3 months ago

@ronaldtse thanks to you and the team for fixing this. most of the missing entries are back now, but i am seeing these errors:

2024-05-29T10:35:56.3581218Z [relaton-doi] (doi:10.1111/ips.12026_1) Found: `10.1111/ips.12026_1`
2024-05-29T10:36:56.6895430Z [relaton] ERROR: `doi:10.1111/ips.12026_1` -- unexpected token at '<html>
2024-05-29T10:36:56.6896960Z <head><title>504 Gateway Time-out</title></head>
2024-05-29T10:36:56.6897851Z <body>
2024-05-29T10:36:56.6898681Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-29T10:36:56.6899410Z </body>
2024-05-29T10:36:56.6899740Z </html>
2024-05-29T10:36:56.6900290Z '
2024-05-29T10:36:56.6921935Z [relaton-doi] (doi:10.1145/3425898.3426958) Fetching from search.crossref.org ...
2024-05-29T10:36:57.0548822Z [relaton-doi] (doi:10.1145/3425898.3426958) Found: `10.1145/3425898.3426958`
2024-05-29T10:37:57.3898616Z [relaton] ERROR: `doi:10.1145/3425898.3426958` -- unexpected token at '<html>
2024-05-29T10:37:57.3900384Z <head><title>504 Gateway Time-out</title></head>
2024-05-29T10:37:57.3901362Z <body>
2024-05-29T10:37:57.3913057Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-29T10:37:57.3922231Z </body>
2024-05-29T10:37:57.3922674Z </html>
2024-05-29T10:37:57.3923298Z '
2024-05-29T10:37:57.3928301Z [relaton-doi] (doi:10.1111/psj.12212) Fetching from search.crossref.org ...
2024-05-29T10:37:57.6616144Z [relaton-doi] (doi:10.1111/psj.12212) Found: `10.1111/psj.12212`
2024-05-29T10:38:57.9348138Z [relaton] ERROR: `doi:10.1111/psj.12212` -- unexpected token at '<html>
2024-05-29T10:38:57.9349913Z <head><title>504 Gateway Time-out</title></head>
2024-05-29T10:38:57.9350869Z <body>
2024-05-29T10:38:57.9351561Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-29T10:38:57.9352370Z </body>
2024-05-29T10:38:57.9352904Z </html>
2024-05-29T10:38:57.9353446Z '
2024-05-29T10:38:57.9366462Z [relaton-doi] (doi:10.1177/26339137231222481) Fetching from search.crossref.org ...
2024-05-29T10:38:58.2818099Z [relaton-doi] (doi:10.1177/26339137231222481) Found: `10.1177/26339137231222481`
2024-05-29T10:39:58.5213769Z [relaton] ERROR: `doi:10.1177/26339137231222481` -- unexpected token at '<html>
2024-05-29T10:39:58.5215503Z <head><title>504 Gateway Time-out</title></head>
2024-05-29T10:39:58.5227683Z <body>
2024-05-29T10:39:58.5235869Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-29T10:39:58.5236511Z </body>
2024-05-29T10:39:58.5236863Z </html>
2024-05-29T10:39:58.5237260Z '
2024-05-29T10:39:58.5243317Z [relaton-doi] (doi:10.1007/978-3-031-01584-7) Fetching from search.crossref.org ...
2024-05-29T10:39:58.8042338Z [relaton-doi] (doi:10.1007/978-3-031-01584-7) Found: `10.1007/978-3-031-01584-7`
2024-05-29T10:40:59.0569368Z [relaton] ERROR: `doi:10.1007/978-3-031-01584-7` -- unexpected token at '<html>
2024-05-29T10:40:59.0571118Z <head><title>504 Gateway Time-out</title></head>
2024-05-29T10:40:59.0572122Z <body>
2024-05-29T10:40:59.0573657Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-29T10:40:59.0601798Z </body>
2024-05-29T10:40:59.0602648Z </html>
2024-05-29T10:40:59.0603126Z '
2024-05-29T10:40:59.0604001Z [relaton-doi] (doi:10.1080/19460171.2014.957056) Fetching from search.crossref.org ...
2024-05-29T10:40:59.3984353Z [relaton-doi] (doi:10.1080/19460171.2014.957056) Found: `10.1080/19460171.2014.957056`
2024-05-29T10:41:59.5764818Z [relaton] ERROR: `doi:10.1080/19460171.2014.957056` -- Net::ReadTimeout with #<TCPSocket:(closed)>
2024-05-29T10:41:59.5779273Z [relaton-doi] (doi:10.1038/s41893-021-00707-5) Fetching from search.crossref.org ...
2024-05-29T10:42:24.1926765Z [relaton-doi] (doi:10.1038/s41893-021-00707-5) Found: `10.1038/s41893-021-00707-5`
2024-05-29T10:43:24.4410409Z [relaton] ERROR: `doi:10.1038/s41893-021-00707-5` -- Net::ReadTimeout with #<TCPSocket:(closed)>
2024-05-29T10:43:24.4425336Z [relaton-doi] (doi:10.1111/j.1467-8306.2004.09402005.x) Fetching from search.crossref.org ...
2024-05-29T10:43:24.6566904Z [relaton-doi] (doi:10.1111/j.1467-8306.2004.09402005.x) Found: `10.1111/j.1467-8306.2004.09402005.x`
2024-05-29T10:44:24.8565285Z [relaton] ERROR: `doi:10.1111/j.1467-8306.2004.09402005.x` -- Net::ReadTimeout with #<TCPSocket:(closed)>
2024-05-29T10:44:24.8580515Z [relaton-doi] (doi:10.1257/aer.100.3.641) Fetching from search.crossref.org ...
2024-05-29T10:44:25.1113772Z [relaton-doi] (doi:10.1257/aer.100.3.641) Found: `10.1257/aer.100.3.641`
2024-05-29T10:45:25.3441072Z [relaton] ERROR: `doi:10.1257/aer.100.3.641` -- Net::ReadTimeout with #<TCPSocket:(closed)>
2024-05-29T10:45:25.3443552Z [relaton-nist] (NIST SP 1270) Fetching from csrc.nist.gov ...
2024-05-29T10:45:25.3690112Z [relaton-nist] (NIST SP 1270) Fetching from Relaton repository ...
2024-05-29T10:45:25.6326214Z [relaton-nist] (NIST SP 1270) Found: `NIST SP 1270`
2024-05-29T10:45:25.6665372Z [relaton-doi] (doi:10.1111/j.1758-5899.2011.00156.x) Fetching from search.crossref.org ...
2024-05-29T10:45:25.8836063Z [relaton-doi] (doi:10.1111/j.1758-5899.2011.00156.x) Found: `10.1111/j.1758-5899.2011.00156.x`
2024-05-29T10:46:26.1740570Z [relaton] ERROR: `doi:10.1111/j.1758-5899.2011.00156.x` -- Net::ReadTimeout with #<TCPSocket:(closed)>

i think they correspond to this render error

[image: render error]

could someone please check this. if you need the bib file, let me know.

andrew2net commented 3 months ago

2024-05-29T10:36:56.6896960Z 504 Gateway Time-out

@ReesePlews this message says that the DOI server had an issue. I tried these DOI references and they work for me. Please try them again.
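
The `unexpected token at '<html>` text in the log is what one would expect when a JSON parser is handed the gateway's HTML error page instead of a JSON payload. The sketch below assumes, without confirmation from this thread, that the response body is parsed with Ruby's standard `json` library. `parse_json_body` and both sample bodies are hypothetical and are not part of Relaton.

```ruby
require "json"

# Hypothetical helper (not Relaton code): parse a response body as JSON,
# returning nil when the body is not JSON, e.g. an HTML error page.
def parse_json_body(body)
  JSON.parse(body)
rescue JSON::ParserError
  nil
end

# Body modelled on the 504 page shown in the log.
gateway_error = "<html>\n<head><title>504 Gateway Time-out</title></head>\n</html>\n"
# Made-up JSON body, only for illustration.
ok_body = '{"doi": "10.1111/psj.12212"}'

p parse_json_body(gateway_error)    # prints nil
puts parse_json_body(ok_body)["doi"] # prints 10.1111/psj.12212
```

Returning `nil` is only one option; a caller could equally re-raise a more descriptive error that names the HTTP status.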

ReesePlews commented 3 months ago

hello @andrew2net, i reran the document generation today and the issue persists, although the number of unreachable entries seems to have decreased. i wonder why they are not reaching the server?

2024-05-30T11:34:58.3096511Z [relaton] ERROR: `doi:10.1111/ips.12026_1` -- unexpected token at '<html>
2024-05-30T11:34:58.3098081Z <head><title>504 Gateway Time-out</title></head>
2024-05-30T11:34:58.3126326Z <body>
2024-05-30T11:34:58.3126975Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-30T11:34:58.3127597Z </body>
2024-05-30T11:34:58.3127967Z </html>
2024-05-30T11:34:58.3128356Z '
2024-05-30T11:34:58.3128851Z [relaton-doi] (doi:10.1145/3425898.3426958) Fetching from search.crossref.org ...
2024-05-30T11:35:05.0569180Z [relaton-doi] (doi:10.1145/3425898.3426958) Found: `10.1145/3425898.3426958`
2024-05-30T11:36:05.1556862Z [relaton] ERROR: `doi:10.1145/3425898.3426958` -- unexpected token at '<html>
2024-05-30T11:36:05.1558747Z <head><title>504 Gateway Time-out</title></head>
2024-05-30T11:36:05.1608365Z <body>
2024-05-30T11:36:05.1609054Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-30T11:36:05.1609715Z </body>
2024-05-30T11:36:05.1610110Z </html>
2024-05-30T11:36:05.1610536Z '
2024-05-30T11:36:05.1611341Z [relaton-doi] (doi:10.1111/psj.12212) Fetching from search.crossref.org ...
2024-05-30T11:36:13.1994063Z [relaton-doi] (doi:10.1111/psj.12212) Found: `10.1111/psj.12212`
2024-05-30T11:37:13.3203786Z [relaton] ERROR: `doi:10.1111/psj.12212` -- unexpected token at '<html>
2024-05-30T11:37:13.3205349Z <head><title>504 Gateway Time-out</title></head>
2024-05-30T11:37:13.3234983Z <body>
2024-05-30T11:37:13.3235804Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-30T11:37:13.3236458Z </body>
2024-05-30T11:37:13.3236845Z </html>
2024-05-30T11:37:13.3237467Z '
2024-05-30T11:37:13.3238094Z [relaton-doi] (doi:10.1177/26339137231222481) Fetching from search.crossref.org ...
2024-05-30T11:37:15.4186069Z [relaton-doi] (doi:10.1177/26339137231222481) Found: `10.1177/26339137231222481`
2024-05-30T11:38:15.6547511Z [relaton] ERROR: `doi:10.1177/26339137231222481` -- unexpected token at '<html>
2024-05-30T11:38:15.6548882Z <head><title>504 Gateway Time-out</title></head>
2024-05-30T11:38:15.6549745Z <body>
2024-05-30T11:38:15.6550454Z <center><h1>504 Gateway Time-out</h1></center>
2024-05-30T11:38:15.6551222Z </body>
2024-05-30T11:38:15.6551756Z </html>
2024-05-30T11:38:15.6552373Z '
2024-05-30T11:38:15.6553372Z [relaton-doi] (doi:10.1007/978-3-031-01584-7) Fetching from search.crossref.org ...
2024-05-30T11:38:27.6612589Z [relaton-doi] (doi:10.1007/978-3-031-01584-7) Found: `10.1007/978-3-031-01584-7`
2024-05-30T11:39:21.7247801Z [relaton-doi] (doi:10.1080/19460171.2014.957056) Fetching from search.crossref.org ...
2024-05-30T11:39:22.3553079Z [relaton-doi] (doi:10.1080/19460171.2014.957056) Found: `10.1080/19460171.2014.957056`
2024-05-30T11:40:20.9806824Z [relaton-doi] (doi:10.1038/s41893-021-00707-5) Fetching from search.crossref.org ...
2024-05-30T11:40:40.8723479Z [relaton-doi] (doi:10.1038/s41893-021-00707-5) Found: `10.1038/s41893-021-00707-5`
2024-05-30T11:41:41.1328409Z [relaton] ERROR: `doi:10.1038/s41893-021-00707-5` -- Net::ReadTimeout with #<TCPSocket:(closed)>

when these entries are missing from the bibliography, errors appear throughout the document. please advise how to deal with this. thank you.

andrew2net commented 3 months ago

@ReesePlews what happens if you try a command like relaton fetch doi:10.1111/ips.12026_1 immediately after you get the error? Just replace the DOI identifier with one that causes the error. I suspect the API we use has a rate limit, so we need to update our script to handle it.
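
One common way to handle a suspected rate limit is to retry the failed request after a growing pause. The following is a minimal sketch of that idea, not the change made in relaton-doi: `with_backoff`, its parameters, and the choice of which errors count as transient are all assumptions made for illustration.

```ruby
require "json"
require "net/http"

# Errors treated as transient in this sketch; both appear in the logs in this thread.
TRANSIENT_ERRORS = [Net::ReadTimeout, JSON::ParserError].freeze

# Hypothetical helper (not part of relaton-doi): runs the block and retries on
# transient errors, pausing base_delay, 2 * base_delay, 4 * base_delay, ...
# between attempts. `sleeper` is injectable so the pause can be stubbed out.
def with_backoff(attempts: 4, base_delay: 1.0, sleeper: ->(s) { sleep(s) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *TRANSIENT_ERRORS
    raise if tries >= attempts # give up and re-raise the last error
    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

# Usage with a stand-in for the real fetch: fails twice, then succeeds.
# The sleeper is stubbed so the example runs instantly.
result = with_backoff(sleeper: ->(_s) {}) do |attempt|
  raise Net::ReadTimeout if attempt < 3
  "fetched on attempt #{attempt}"
end
puts result # prints: fetched on attempt 3
```

If the API reports a retry interval in its response, honoring that value would be preferable to a fixed schedule; the thread does not say whether it does.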

UPD: moved the issue to relaton-doi.

ReesePlews commented 3 months ago

@andrew2net thank you for checking into this.

i am not using relaton directly, only placing the statements into an asciidoc file to build the bibliography of a standard, which is generated on github. i can run the relaton command you suggested from my local machine. i ran it three times on the command line and all three runs returned data without an error.

i see you have made some modifications, will that take care of this issue? will this be in the next release of metanorma?

i don't know how to make a change when my document is being generated on the github system. if i need to structure the asciidoc file differently i can do that if you provide a bit of advice. thank you.

andrew2net commented 3 months ago

@ReesePlews I understand that you are using Metanorma. Metanorma uses Relaton to fetch many documents in parallel. If you don't have the error fetching a single document, then the error is induced by the API's rate limit.
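
To illustrate why parallel fetching can trip a rate limit and how a cap on concurrency reduces the pressure, here is a small sketch of a bounded worker pool. It does not describe how Metanorma or Relaton schedule requests; `map_with_limit` and its `workers` parameter are hypothetical.

```ruby
# Hypothetical helper: applies the block to every item using at most
# `workers` threads at a time and returns the results in input order.
def map_with_limit(items, workers: 2)
  queue = Queue.new
  items.each_with_index { |item, i| queue << [item, i] }
  queue.close # a closed, drained queue makes `pop` return nil

  results = Array.new(items.size)
  threads = Array.new([workers, items.size].min) do
    Thread.new do
      while (pair = queue.pop)
        item, i = pair
        results[i] = yield(item)
      end
    end
  end
  threads.each(&:join)
  results
end

# Usage with a stand-in for the real fetch (no network access here).
dois = %w[10.1111/psj.12212 10.1257/aer.100.3.641 10.1038/s41893-021-00707-5]
labels = map_with_limit(dois, workers: 2) { |doi| "doi:#{doi}" }
puts labels
```

With `workers: 1` the requests run strictly one after another, which matches the single-document case that did not fail.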

I closed this issue because the problem with ISO titles was solved. The fetching error occurs in another gem, relaton-doi, so I created a new issue here: https://github.com/relaton/relaton-doi/issues/18. Let's continue the discussion there.