Closed dhimmel closed 6 years ago
Thank you for your suggestion.
To answer your question: isbnlib.metadata returns a dictionary with keys ('ISBN-13', 'Title', 'Authors', 'Publisher', 'Year', 'Language') and string values (a list of strings for 'Authors').
These fields are common to all providers and are fixed in the library. Even then, 'Language' is NOT used by the built-in 'bibformatters', because in bibliographic citations 'Language' is the language in which the book is written, but that is NOT the meaning of 'Language' in ISBN registries (there it is usually the main language of the publisher's country)!
I will take a look at this CSL format and see if it makes sense to install it in the core library as a new block in isbnlib/dev/_fmt.py (probably yes if it is widely used) or as an add-in.
But please, feel free to have a go!
From a quick look at CSL-JSON, it seems that implementing CSL formatting only requires creating a new template in isbnlib/dev/_fmt.py like:
csl = r'''{"type":"book", "id":"$ISBN", "title":"$Title", "issued": {"raw": "$Year"}, "ISBN":"$ISBN", "publisher":"$Publisher", "author": [$AUTHORS]}'''
with post-processing for $AUTHORS:
elif name == 'csl': AUTHORS = ', '.join('{"literal": "$"}'.replace("$", a) for a in authors)
Is this a correct CSL-JSON data fragment, and is it enough?
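To make the template idea concrete, here is a minimal, self-contained sketch. It uses Python's standard string.Template as a stand-in for isbnlib's internal formatter, so it is not the library's actual code. render_csl is a hypothetical helper, and the record is made-up sample data shaped like the metadata dictionary described above.

```python
import json
from string import Template

# Template mirroring the proposed 'csl' entry; $AUTHORS is filled in separately.
CSL_TEMPLATE = Template(
    '{"type":"book", "id":"$ISBN", "title":"$Title", '
    '"issued": {"raw": "$Year"}, "ISBN":"$ISBN", '
    '"publisher":"$Publisher", "author": [$AUTHORS]}'
)


def render_csl(record):
    """Hypothetical helper: fill the template from a metadata-style dict."""
    # Build one {"literal": ...} object per author, as in the post-processing step.
    authors = ', '.join('{"literal": "%s"}' % a for a in record['Authors'])
    return CSL_TEMPLATE.substitute(
        ISBN=record['ISBN-13'],
        Title=record['Title'],
        Year=record['Year'],
        Publisher=record['Publisher'],
        AUTHORS=authors,
    )


# Sample data only; nothing is fetched from a provider here.
record = {
    'ISBN-13': '9780321534965',
    'Title': 'The Art Of Computer Programming',
    'Authors': ['Donald Ervin Knuth'],
    'Publisher': 'Addison-Wesley',
    'Year': '2008',
    'Language': '',
}

# Parsing the result is a quick check that the fragment is well-formed JSON.
parsed = json.loads(render_csl(record))
```

Because the values are pasted into the string verbatim, this only yields valid JSON when no field contains characters such as double quotes or backslashes.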
Agree with the general strategy. A few points / questions:
You probably want to set date_parts rather than raw for issued. It'd be like {'date_parts': [[year]]}. It's an odd format, but that's what most styles will implement.
I'm not a fan of hardcoding the JSON structure. Instead, I'd construct a dict/OrderedDict and then json.dumps it. For example, will the implementation above escape problematic characters in the fields?
Are all values guaranteed to be populated? If not, it's better to omit that key-value pair entirely, rather than have a blank value.
Setting URL may also be nice... is there a way to get a URL for an ISBN?
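The first three points can be sketched together as follows. This is only an illustration under stated assumptions: to_csl is a hypothetical function (not part of isbnlib), the input is assumed to be a dictionary with the keys listed earlier in the thread, and the sample record is made up. Note that the CSL-JSON schema spells the date key with a hyphen, "date-parts", so the sketch uses that spelling.

```python
import json
from collections import OrderedDict


def to_csl(record):
    """Hypothetical converter from a metadata-style dict to a CSL-JSON item."""
    isbn = record.get('ISBN-13')
    year = str(record.get('Year') or '')
    candidates = [
        ('type', 'book'),
        ('id', isbn),
        ('title', record.get('Title')),
        ('author', [{'literal': a} for a in record.get('Authors', []) if a]),
        # Only emit 'issued' when the year is a plain number; store it as an int.
        ('issued', {'date-parts': [[int(year)]]} if year.isdigit() else None),
        ('ISBN', isbn),
        ('publisher', record.get('Publisher')),
    ]
    # Omit empty strings, empty lists and None instead of writing blank values.
    return OrderedDict((key, value) for key, value in candidates if value)


# Made-up sample: the title contains quotes and the publisher is missing.
sample = {
    'ISBN-13': '9780321534965',
    'Title': 'The "Art" Of Computer Programming',
    'Authors': ['Donald Ervin Knuth'],
    'Publisher': '',
    'Year': '2008',
    'Language': '',
}

csl = to_csl(sample)
# json.dumps takes care of escaping the quotes inside the title.
text = json.dumps(csl)
```

Building the structure first and serializing once at the end leaves escaping to json.dumps and makes it easy to skip fields that a provider did not return.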
Here is my reply point-by-point:
date_parts is OK, especially if that is what most styles implement.
isbnlib.metadata: as you wish. Some cleaning is done on the data to avoid some obvious problematic cases, but no validation is attempted.
URL is very problematic to get consistently... by default the metadata for each ISBN is obtained from several providers! Some don't provide a URL, and in some cases the URL is not deterministic... it depends on the region and the user (e.g. Google Books)!
But maybe it is not a good idea to implement this in the core of isbnlib, and better to do a plug-in, because:
isbnlib already supports a general-purpose BibJSON format, and with some simple post-processing you can get CSL-JSON from it.
Anyway, I have already implemented a 'simple' version to support 'CSL-JSON'! It produces things like this:
{"type":"book",
"id":"9780321534965",
"title":"The Art Of Computer Programming",
"author": [{"literal": "Donald Ervin Knuth"}],
"issued": {"date_parts": [["2008"]]},
"ISBN":"9780321534965",
"publisher":"Addison-Wesley"}
Is this a valid CSL document? Is this useful?
@xlcnd that would be useful. If you open a PR, I'd be happy to review. The only issue that I see presently is that 2008 should not be quoted. It should be an int.
It's already in the dev branch. Year is now an int.
It would be nice to have a bibformatter to export ISBN metadata to Citation Style Language (CSL) JSON. This would help us add support for ISBN citations in Manubot: see https://github.com/greenelab/manubot/issues/14.
I'm envisioning being able to do the following:
csl would presumably be a dict or collections.OrderedDict. Alternatively, it could be already dumped as a JSON string (although I think that's less preferable).
CSL JSON is a way of storing bibliographic metadata that is a successor to formats like bibtex. It's commonly used in scholarly publishing. The documentation isn't great, but here's a schema definition. Here's also some written doc.
I'm happy to help as needed. In particular, I can help convert the output of isbnlib.meta to CSL JSON. Is there documentation of all the possible keys returned in the output of isbnlib.meta?