Closed by daniel-chou 4 years ago
Hey @daniel-chou, sorry for the delay.
While looking into the metadata issue, we also decided to move away from the single tarball. Instead, we've sharded the metadata file into multiple files, one per batch of papers.
They can be found at ai2-s2-gorc-release/20190928/metadata/ labeled 0.tsv, 1.tsv, ..., to 10000.tsv.
Hopefully this makes things easier!
@kyleclo Thanks for sharding the metadata file into multiple TSV files! This works well for me. 👍
Just to confirm: the .tsv files range from 0 to 9998; neither 9999.tsv nor 10000.tsv exists. Is this correct?
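A gap like this can be spotted by comparing the file names actually present against the expected range. This is a minimal sketch, assuming the shard names have already been listed into a Python list; the helper name `missing_shards` and the expected count of 10000 are illustrative and not part of the release.

```python
def missing_shards(present_names, expected_count):
    """Return the expected shard file names that are absent from present_names."""
    present = set(present_names)
    return [f"{i}.tsv" for i in range(expected_count) if f"{i}.tsv" not in present]


# Hypothetical listing: shards 0..9998 are present, 9999 is not.
listing = [f"{i}.tsv" for i in range(9999)]
print(missing_shards(listing, 10000))  # ['9999.tsv']
```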
Ah, nice catch! Looks like we forgot to copy 9999.tsv over. It's added now. There is no 10000.tsv. Thanks!
Great! I downloaded 9999.tsv just now. Thank you for resolving this issue. 👍
The metadata file at the S3 path (s3://ai2-s2-gorc-release/20190928/metadata.tar.gz) appears to be in bzip2 format. Would you consider renaming the file? Perhaps metadata.tar.bz2 would be a possibility?
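A mismatch between extension and actual format can be confirmed from the first few bytes of the file: gzip streams start with the bytes 0x1f 0x8b, while bzip2 streams start with the ASCII characters "BZh". The sketch below illustrates that check on in-memory data; the function name `sniff_compression` is hypothetical, and reading the header from the real archive is left out.

```python
import bz2
import gzip

# Leading magic bytes for the two formats discussed here.
MAGIC = {b"\x1f\x8b": "gzip", b"BZh": "bzip2"}


def sniff_compression(header: bytes) -> str:
    """Guess the compression format from the first bytes of a file."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return "unknown"


# Demonstration on freshly compressed data rather than the real archive.
print(sniff_compression(bz2.compress(b"demo")[:3]))   # bzip2
print(sniff_compression(gzip.compress(b"demo")[:3]))  # gzip
```

For a file on disk, the same function could be applied to the result of reading the first three bytes in binary mode. Python's `tarfile.open` with the default mode `"r"` detects the compression transparently, so an archive like this should still open despite the misleading extension.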