relaton / relaton-nist

NistBib: retrieve NIST Standards for bibliographic use using the BibliographicItem model
https://www.metanorma.com
MIT License

URGENT: Errors/crash in fetching several NIST SP documents #86

Closed ronaldtse closed 1 year ago

ronaldtse commented 1 year ago

From SP 800-53r5:

[relaton-nist] ("NIST SP 800-50") fetching...
[relaton] ERROR: NIST SP 800-50 -- zlib error while inflating
[relaton-nist] ("NIST SP 800-12") found SP 800-12
[relaton-nist] ("NIST SP 800-18") found SP 800-18 Rev. 1
[relaton-nist] ("NIST SP 800-53A") fetching...
[relaton-nist] ("NIST SP 800-53B") fetching...
[relaton-nist] ("NIST SP 800-53A") found SP 800-53A Rev. 5
[relaton-nist] ("NIST FIPS 202") found FIPS 202
[relaton-nist] ("NIST SP 800-55") fetching...
[relaton-nist] ("NIST FIPS 140-3") found FIPS 140-3
[relaton-nist] ("NIST SP 800-53B") found SP 800-53B
[relaton-nist] ("NIST SP 800-55") found SP 800-55 Rev. 1
[relaton-nist] ("NIST SP 800-56B") fetching...
[relaton-nist] ("NIST SP 800-56C") fetching...
[relaton-nist] ("NIST SP 800-57-1") fetching...
[relaton-nist] ("NIST SP 800-56A") fetching...
[relaton-nist] ("NIST SP 800-56A") found SP 800-56A Rev. 3
[relaton-nist] ("NIST SP 800-56B") found SP 800-56B Rev. 2
[relaton-nist] ("NIST SP 800-56C") found SP 800-56C Rev. 2
[relaton-nist] ("NIST SP 800-57-2") fetching...
[relaton-nist] ("NIST SP 800-57-3") fetching...
[relaton-nist] ("NIST SP 800-52") fetching...
[relaton-nist] ("NIST SP 800-60-2") fetching...
[relaton-nist] ("NIST SP 800-60-1") fetching...
[relaton] ERROR: NIST SP 800-60-1 -- zlib error while inflating
[relaton] ERROR: NIST SP 800-52 -- zlib error while inflating
[relaton-nist] ("NIST SP 800-61") fetching...
[relaton-nist] ("NIST SP 800-63-3") fetching...
[relaton] ERROR: NIST FIPS 197 -- zlib error while inflating
[relaton] ERROR: NIST SP 800-63-3 -- Zip end of central directory signature not found
[relaton] ERROR: NIST FIPS 180-4 -- Zip end of central directory signature not found
[relaton] ERROR: NIST FIPS 201-2 -- 859: unexpected token at ''
[relaton-nist] ("NIST SP 800-63A") fetching...
[relaton-nist] ("NIST SP 800-70") fetching...
[relaton] ERROR: NIST SP 800-61 -- undefined method `bytesize' for nil:NilClass

      return if buf.bytesize == ::Zip::CDIR_ENTRY_STATIC_HEADER_LENGTH
                   ^^^^^^^^^

And

[relaton-nist] WARNING: no match found online for NIST SP 800-160-2. The code must be exactly like it is on the standards website.
[relaton-nist] The provided document part may not exist, or the document may no longer be published in parts.
[relaton-nist] WARNING: no match found online for NIST SP 800-160-1. The code must be exactly like it is on the standards website.
[relaton-nist] The provided document part may not exist, or the document may no longer be published in parts.
[relaton-nist] WARNING: no match found online for NIST SP 800-154. The code must be exactly like it is on the standards website.
[relaton-nist] The provided document part may not exist, or the document may no longer be published in parts.
[relaton-nist] WARNING: no match found online for NIST SP 800-188. The code must be exactly like it is on the standards website.
[relaton-nist] The provided document part may not exist, or the document may no longer be published in parts.

This is blocking https://github.com/metanorma/mn-samples-nist/pull/79

andrew2net commented 1 year ago

@ronaldtse we don't have SP 800-160-1 and SP 800-160-2 in our sources. Maybe the references should be SP 800-160v1 and SP 800-160v2? Likewise, there are no NIST SP 800-60-1, NIST SP 800-154, or NIST SP 800-188 entries, but NIST SP 800-60v1, NIST SP 800-154 (PD), and NIST SP 800-188 (PD) do exist in the https://csrc.nist.gov/CSRC/media/feeds/metanorma/pubs-export.zip source. The other references work for me. Do you use async fetching when you get the issue? If so, we seem to have a bug with concurrent fetching of pubs-export.zip. How can I reproduce the issue on my local machine?

ronaldtse commented 1 year ago

@andrew2net wow you are correct here on all counts!

We will update those particular references (ping @manuelfuenmayor )

I am using Metanorma to fetch from Relaton. I don't know if the problem is using async fetching, but I know for sure we are fetching A LOT of references at once because there are a lot of refs in this doc. (see the document PR)

I think once we fetch pubs-export.zip once, we should respect the ETag and cache it properly.
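
A minimal sketch of what respecting the ETag could look like, using a conditional GET with `If-None-Match`. The class name `EtagCache` and the injectable `http_get` parameter are hypothetical and exist only for illustration; this is not the relaton-nist implementation, and it assumes the server actually returns an `ETag` header for the archive.

```ruby
require "net/http"
require "uri"

# Hypothetical helper, not part of relaton-nist: keeps the last body and ETag
# in memory and revalidates them with a conditional GET.
class EtagCache
  def initialize(url, http_get: nil)
    @uri = URI(url)
    # http_get is injectable so the logic can be exercised without a network.
    @http_get = http_get || method(:default_get)
    @etag = nil
    @body = nil
  end

  # Returns the cached body when the server answers 304 Not Modified,
  # otherwise stores and returns the fresh body.
  def fetch
    headers = @etag ? { "If-None-Match" => @etag } : {}
    status, etag, body = @http_get.call(@uri, headers)
    return @body if status == 304 && @body
    raise "unexpected HTTP status #{status}" unless status == 200

    @etag = etag
    @body = body
  end

  private

  # Performs the real request and returns [status, etag, body].
  def default_get(uri, headers)
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.get(uri.request_uri, headers)
    end
    [response.code.to_i, response["ETag"], response.body]
  end
end
```

With this sketch, `EtagCache.new("https://csrc.nist.gov/CSRC/media/feeds/metanorma/pubs-export.zip").fetch` would download the archive on the first call and send `If-None-Match` on later calls, reusing the stored body on a 304 response. A persistent cache would also need to write the ETag and body to disk.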

ronaldtse commented 1 year ago

@andrew2net these are the remaining issues in this ticket:

[relaton] ERROR: NIST SP 800-50 -- zlib error while inflating
[relaton] ERROR: NIST FIPS 180-4 -- Zip end of central directory signature not found
[relaton] ERROR: NIST FIPS 201-2 -- 859: unexpected token at ''
[relaton] ERROR: NIST SP 800-61 -- undefined method `bytesize' for nil:NilClass

      return if buf.bytesize == ::Zip::CDIR_ENTRY_STATIC_HEADER_LENGTH
                   ^^^^^^^^^

andrew2net commented 1 year ago

> I think once we fetch pubs-export.zip once, we should respect the ETag and cache it properly.

@ronaldtse this is correct, but with async fetching one thread may start downloading pubs-export.zip and another may start the same download because the first one hasn't finished yet. I can try to model the situation, but it would help if you could tell me how to reproduce the issue using Metanorma.

ronaldtse commented 1 year ago

To reproduce, assuming there is no Relaton cache:

$ cd mn-samples-nist
$ bundle
$ cd sources/800-53r5
$ bundle exec metanorma document.adoc
# ...

I think we should use a queue for the fetches to pubs-export.zip.

andrew2net commented 1 year ago

@ronaldtse I don't get any error with mn-samples-nist. Try removing the Gemfile.lock file and running bundle again.

> I think we should use a queue for the fetches to pubs-export.zip.

I don't quite get the idea. We already use a queue for threads, but it doesn't prevent several threads from each loading the archive. I think we need to create a singleton class in relaton-nist that fetches the pubs-export.zip file and wraps the fetching method in a Mutex block. The first thread that calls the fetch will then hold the lock, and other threads that need pubs-export.zip will have to wait until it is loaded. Does that make sense?
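
The singleton-plus-Mutex idea can be sketched roughly as follows. The class name `PubsExportFetcher`, the `downloads` counter, and the block that stands in for the real download are all hypothetical; this only illustrates the locking pattern and is not the code that shipped in relaton-nist.

```ruby
require "singleton"

# Hypothetical class name; a sketch of the idea, not the relaton-nist code.
class PubsExportFetcher
  include Singleton

  attr_reader :downloads

  def initialize
    @mutex = Mutex.new
    @data = nil
    @downloads = 0
  end

  # The block stands in for the real download. Only the first caller runs it;
  # the other callers wait on the mutex and then reuse the stored result.
  def fetch
    @mutex.synchronize do
      if @data.nil?
        @downloads += 1
        @data = yield
      end
      @data
    end
  end
end

# Eight threads ask for the archive at the same time.
threads = 8.times.map do
  Thread.new do
    PubsExportFetcher.instance.fetch do
      sleep 0.05 # simulate a slow download
      "archive bytes"
    end
  end
end
results = threads.map(&:value)

puts PubsExportFetcher.instance.downloads # prints 1
puts results.uniq.inspect                 # prints ["archive bytes"]
```

Holding the lock for the whole download is the simplest choice: waiting threads cannot observe a half-written archive, at the cost of blocking until the first download finishes.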

andrew2net commented 1 year ago

Thread-safe fetching of pubs-export.zip is implemented in v1.14.2.

andrew2net commented 1 year ago

@ronaldtse can we close this issue?

ronaldtse commented 1 year ago

This issue is indeed fixed but I'm still seeing issues in that particular document. Will open new ticket.