dkg closed this 3 years ago
Note: when I run this against data.bib in the the-encryption-compendium.github.io repo, I get the following outcome:
duplicate IDs: {'obama_coalition_2016', 'noauthor_considerations_2016'}
The obama_coalition_2016 entries look like this:
@misc{obama_coalition_2016,
title = {Coalition {Letter} to {President} {Obama} 04/11/2016},
url = {https://www.accessnow.org/cms/assets/uploads/2016/04/Encryption-Letter.pdf},
language = {English},
collaborator = {Obama, Barack},
month = apr,
year = {2016},
keywords = {2010s, Backdoors, Compliance with Court Orders Act},
}
[…]
@misc{obama_coalition_2016,
title = {Coalition {Letter} to {President} {Obama} 10/27/2016},
collaborator = {Obama, Barack},
month = oct,
year = {2016},
keywords = {2010s, Access Now, Apple, Backdoors, EFF, International},
}
These seem to be distinct documents, though there is so little data about the second one that I don't know what it is. I think it's a followup to the https://savecrypto.org/ petition.
And the noauthor_considerations_2016 entries look like this:
@misc{noauthor_considerations_2016,
title = {Considerations for {Encryption} in {Public} {Safety} {Radio} {Systems}},
publisher = {Federal Partnership for Interoperable Communications},
month = sep,
year = {2016},
keywords = {2010s, Public Safety Radios},
}
[…]
@misc{noauthor_considerations_2016,
title = {Considerations for {Encryption} in {Public} {Safety} {Radio} {Systems}},
url = {https://www.dhs.gov/sites/default/files/publications/20160830_fact_sheet_considerations_final_draft508_0.pdf},
language = {en},
publisher = {Federal Partnership for Interoperable Communications},
month = sep,
year = {2016},
note = {Two-pager explanatory document},
keywords = {2010s, Department of Homeland Security, Encryption Standards, Public Safety, Public Safety Radios},
}
I think this pair is a duplicate, and the first one should be deleted from the file.
Thanks for making this! Sorry it took a bit to get around to looking at it. It looks good to me though. I've just deleted the duplicate you found from Zotero, so that should hopefully no longer be an issue.
This is a simple test -- any duplicate bibtex ID will cause all but one of the entries to be skipped in entries_dict due to the collision.
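To make the collision concrete, here is a minimal sketch in plain Python. It does not use the real parser: the `entries` list and the `entries_dict` built from it are hypothetical stand-ins, assuming only that the dictionary is keyed by entry ID as described above.

```python
# Hypothetical stand-in for parsed entries: two records share one ID.
entries = [
    {"ID": "obama_coalition_2016", "month": "apr"},
    {"ID": "obama_coalition_2016", "month": "oct"},
    {"ID": "noauthor_considerations_2016", "month": "sep"},
]

# Keying by ID keeps one record per key. In this sketch the later
# record overwrites the earlier one; the other is silently dropped.
entries_dict = {entry["ID"]: entry for entry in entries}

print(len(entries), len(entries_dict))
```

Three records go in and only two come out, with no error raised, which is why the duplicate has to be detected before the dictionary is relied on.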
The goal is to abort the site build if data.bib has such a duplicate entry.
This is still pretty weak: it doesn't test whether data.bib is valid bibtex, and it doesn't verify anything else about the entries. But it also offers a place to do more in-depth consistency/cleanliness checks on the data scraped from Zotero.
Addresses (but does not close) #46.
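For readers who want to try a check like this outside the build, the following is a rough sketch and not necessarily how the test in this PR is written. It assumes every entry starts with `@type{key,` at the beginning of a line and scans for keys with a regular expression instead of a BibTeX parser. The names `find_duplicate_ids` and `check` are made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumption: each entry opens with "@type{key," at the start of a line.
ENTRY_KEY = re.compile(r"^\s*@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)
# BibTeX blocks that are not bibliography entries.
NON_ENTRIES = {"comment", "string", "preamble"}


def find_duplicate_ids(bibtex_text):
    """Return the set of entry IDs that occur more than once."""
    counts = Counter(
        key
        for kind, key in ENTRY_KEY.findall(bibtex_text)
        if kind.lower() not in NON_ENTRIES
    )
    return {key for key, n in counts.items() if n > 1}


def check(path):
    """Exit with a non-zero status if the file at `path` has duplicate IDs."""
    with open(path, encoding="utf-8") as f:
        duplicates = find_duplicate_ids(f.read())
    if duplicates:
        print(f"duplicate IDs: {duplicates}", file=sys.stderr)
        sys.exit(1)
```

Calling `check("data.bib")` early in a build script would stop the build on the first run that finds a duplicate. Because the scan is purely textual, it shares the weakness noted above: it says nothing about whether the file is otherwise valid bibtex.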