cverluise / PatCit

Making Patent Citations Uncool Again
https://cverluise.github.io/PatCit/
MIT License

Zotero gzipped file is corrupt #54

Closed verginer closed 1 year ago

verginer commented 1 year ago

How to reproduce the behaviour

Download intext_patent_csv.tar from Zotero

Your Environment

The file with index 005 (the sixth archive in the release) is corrupt.

$ gzip -tv data-release/*.gz
data-release/intext_patent_000000000000.csv.gz:   OK
data-release/intext_patent_000000000001.csv.gz:   OK
data-release/intext_patent_000000000002.csv.gz:   OK
data-release/intext_patent_000000000003.csv.gz:   OK
data-release/intext_patent_000000000004.csv.gz:   OK
gzip: data-release/intext_patent_000000000005.csv.gz: unexpected end of file
gzip: data-release/intext_patent_000000000005.csv.gz: uncompress failed
data-release/intext_patent_000000000005.csv.gz:   NOT OK

verginer commented 1 year ago

Sorry for the noise. The first download had the correct md5 hash (93641563e5563ba7c80f676145965086) but still contained a broken file 005. Downloading the archive again, with the same md5 hash, fixed it. I am not sure how that is possible. I am closing the issue. Sorry for the alarm.
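
For anyone who hits something similar, the two checks used in this thread (the md5 hash of the downloaded archive and the `gzip -t` integrity test on each extracted file) can be combined in a small script. The following is only a minimal sketch in Python using the standard library. The function names `md5_of_file`, `gzip_is_intact` and `check_release` are hypothetical helpers and not part of PatCit, and the sketch assumes the tar has already been extracted into a directory such as `data-release/`.

```python
import gzip
import hashlib
import zlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def gzip_is_intact(path, chunk_size=1 << 20):
    """Decompress the whole file and report whether it reads cleanly.

    A truncated file raises EOFError, a bad header raises OSError
    (gzip.BadGzipFile is a subclass), and damaged data raises zlib.error.
    """
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


def check_release(directory):
    """Map each *.gz file name in `directory` to True (OK) or False (NOT OK)."""
    return {
        path.name: gzip_is_intact(path)
        for path in sorted(Path(directory).glob("*.gz"))
    }
```

Calling `md5_of_file("intext_patent_csv.tar")` and comparing the result with the published hash, then calling `check_release("data-release")` after extraction, would reproduce both checks. Running them on the files actually on disk also helps tell a bad download apart from a problem introduced later, for example during extraction.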