Closed (rgaudin closed this 1 year ago)
We've decided that in all of scraperlib's current use cases, we know all metadata before starting the Creator.
So we'll raise an exception if any mandatory metadata is not set at start() time.
We can revisit this behavior in the future.
We could maybe have an extra method that sets empty or random data for debugging and tests.
I would like to work on this, but I'm a bit confused about the API design. Does it mean that we need to add a check in the code below that verifies that none of the mandatory metadata is null?
https://github.com/openzim/python-scraperlib/blob/e67f0145c5879708508a34a784f086d741fdbdd0/src/zimscraperlib/zim/creator.py#L117-L126
Yes. We also need to keep a reference to all metadata that has been set (by overriding add_metadata, for instance).
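A minimal sketch of that idea, assuming a stand-in base class rather than the real libzim Creator: `BaseCreator`, `CheckedCreator` and the `MANDATORY` tuple are hypothetical names used only for illustration. The subclass records every value passed to add_metadata, refuses to start while a mandatory key is missing, and only then forwards the recorded values.

```python
class BaseCreator:
    """Stand-in for the wrapped creator (hypothetical, illustration only)."""

    def __init__(self):
        self.started = False
        self.written = {}

    def add_metadata(self, name, value):
        # A real creator would write the metadata entry into the ZIM here.
        self.written[name] = value

    def start(self):
        self.started = True
        return self


# Assumed list of mandatory keys, taken from the ticket description.
MANDATORY = (
    "Name", "Title", "Creator", "Publisher", "Date",
    "Description", "Language", "Illustration_48x48@1",
)


class CheckedCreator(BaseCreator):
    def __init__(self):
        super().__init__()
        self._metadata = {}  # reference to everything set so far

    def add_metadata(self, name, value):
        # Only record the value; it is forwarded once start() succeeds.
        self._metadata[name] = value

    def start(self):
        missing = [key for key in MANDATORY if not self._metadata.get(key)]
        if missing:
            raise ValueError("Mandatory metadata not set: " + ", ".join(missing))
        super().start()
        for name, value in self._metadata.items():
            super().add_metadata(name, value)
        return self
```

Deferring the writes until start() is one possible design choice; it keeps the check and the actual write in a single place.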
As I was trying to solve this problem, I realized that if we want to solve it along with several other issues (#87, #94), we need to adjust the design of the Creator API.
Maybe we can introduce a class named Metadata to do jobs like checking formats (Language: ISO639-3, Date: ISO YYYY-MM-DD).
And I'm not sure at which level we should add it: python-libzim, python-scraperlib, or libzim.
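To make the proposal concrete, here is a rough sketch of what such a Metadata class could look like. Everything in it is an assumption for illustration: the class layout, the `set`/`get` methods and the helper names are hypothetical, and the Language check only verifies the three-letter shape of a single code rather than looking it up in the ISO 639-3 table.

```python
import re
from datetime import date

ISO639_3_SHAPE = re.compile(r"[a-z]{3}")
ISO_DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_language(value):
    # Shape check only: three lowercase letters, no lookup in the code table.
    if not ISO639_3_SHAPE.fullmatch(value):
        raise ValueError(f"Language is not a three-letter code: {value!r}")
    return value


def check_date(value):
    if not ISO_DATE_SHAPE.fullmatch(value):
        raise ValueError(f"Date is not formatted as YYYY-MM-DD: {value!r}")
    date.fromisoformat(value)  # raises ValueError for dates like 2023-02-30
    return value


class Metadata:
    """Hypothetical container that validates values as they are set."""

    CHECKS = {"Language": check_language, "Date": check_date}

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        check = self.CHECKS.get(name)
        self._values[name] = check(value) if check else value

    def get(self, name):
        return self._values[name]
```

Keys without a registered check are stored unchanged, so new rules can be added one key at a time.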
> And I'm not sure at which level we should add it: python-libzim, python-scraperlib, or libzim.

scraperlib
On implementation, keep in mind that what matters most is comfort for people writing scrapers and/or creating ZIMs via Python.
You can pick another ticket if this one is not defined enough. It's actually mostly about API design and should not be considered a good first issue.
See https://github.com/openzim/python-scraperlib/pull/96 comments for updated description
https://github.com/openzim/libzim/issues/785 might be of interest for this ticket.
Rah, we forgot to attach this to the second PR. This has been implemented in 3.0.0.
Thanks for the pointer; we'll see if and how it gets implemented, but chances are we will keep a separate logic: we have a lot more flexibility (we test image format and size, for instance), and we go beyond the spec by applying the Kiwix recommendations.
Our Creator wrapper should make sure that we do supply valid (albeit possibly useless) values for the mandatory metadata required by the spec.
This would mean:
- Title, Description, Creator, Publisher
- a (TODAY()) value for Date
- Name
- an eng value for Language
- Illustration_48x48@1