inveniosoftware / dojson

Simple pythonic JSON to JSON converter.
https://dojson.readthedocs.io

overdo: addition of liberal mode for datafield #61

Closed greut closed 8 years ago

greut commented 8 years ago

MARC XML document with unknown datafield tag.

<record>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Donges, Jonathan F</subfield>
  </datafield>
  <datafield tag="999" ind1=" " ind2=" ">
    <subfield code="a">I'm crazy field!</subfield>
  </datafield>
</record>

What's known is translated and the rest is kept as is:

{
  "main_entry_personal_name": {
    "personal_name": "Donges, Jonathan F"
  },
  "999__": {
    "a": "I'm crazy field!"
  }
}
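
The `999__` key is the datafield tag followed by the two indicators, with blank indicators written as underscores. Below is a minimal sketch of that key derivation using only the standard library. It is not dojson's own MARCXML parser; `datafield_key` and `parse_datafields` are hypothetical names, and repeated fields or subfields are not handled.

```python
import xml.etree.ElementTree as ET

MARCXML = """<record>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Donges, Jonathan F</subfield>
  </datafield>
  <datafield tag="999" ind1=" " ind2=" ">
    <subfield code="a">I'm crazy field!</subfield>
  </datafield>
</record>"""


def datafield_key(field):
    """Build a key such as '999__': tag + ind1 + ind2, blanks as '_'."""
    # Treat a missing or empty indicator like a blank one (assumption).
    ind1 = field.get("ind1") or " "
    ind2 = field.get("ind2") or " "
    return field.get("tag") + (ind1 + ind2).replace(" ", "_")


def parse_datafields(xml_text):
    """Return {key: {subfield code: text}} for each datafield.

    Simplification: a repeated field or subfield overwrites the earlier one.
    """
    record = ET.fromstring(xml_text)
    return {
        datafield_key(field): {
            sub.get("code"): sub.text for sub in field.iter("subfield")
        }
        for field in record.iter("datafield")
    }


blob = parse_datafields(MARCXML)
```

A converter with a rule for `100__` but none for `999__` would then translate the first entry and, in liberal mode, copy the second one through under its raw key.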

Remaining problem: unknown subfields (e.g. RECORD_AUDUB) are still lost.

related: #26
ping: @Kennethhole @audub

audub commented 8 years ago

Love it. Subfields are quite important though ;)

greut commented 8 years ago

@audub it'll be easier to build something on top of #55, where there is already a list of fields (field_map).

jirikuncar commented 8 years ago

Why don't you just create a rule that matches all 9xx fields?

egabancho commented 8 years ago

@jirikuncar I think this covers not only the 9xx fields but also the other custom fields that MARC21 allows, e.g. 69x, plus any field an installation uses to represent its metadata that is not part of the standard. I picture liberal mode being used especially when porting the history of a record, since older versions may contain fields that are absent from the latest version but that we still want to preserve.

jirikuncar commented 8 years ago

Proposal

What about something like this:

def do(self, blob, ignore_missing=True, exception_handlers=None):
    ...
    handlers = {IgnoreKey: None}
    handlers.update(exception_handlers or {})
    if ignore_missing:
        handlers.setdefault(MissingRule, None)
    ...
    for key, value in iteritems(blob):
        try:
            ...
        except Exception as exc:
            if exc.__class__ in handlers:
                handler = handlers[exc.__class__]
                if handler is not None:
                    handler(exc, output, key, value)
            else:
                raise
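
To make the proposal concrete, here is a self-contained sketch of the same dispatch with the elided parts filled in by a toy rule lookup. Everything here is hypothetical and stands in for the real thing: `ToyConverter` is not dojson's `Overdo`, the two exception classes are local stand-ins, and `keep_as_is` is just one possible handler for the liberal mode.

```python
class IgnoreKey(Exception):
    """Stand-in: raised by a rule to drop a key from the output."""


class MissingRule(Exception):
    """Stand-in: raised when no rule matches a key."""


class ToyConverter:
    """Hypothetical converter used only to exercise the handler dispatch."""

    def __init__(self, rules):
        # rules: {source key: (output name, function applied to the value)}
        self.rules = rules

    def do(self, blob, ignore_missing=True, exception_handlers=None):
        # A handler of None means "swallow the exception silently".
        handlers = {IgnoreKey: None}
        handlers.update(exception_handlers or {})
        if ignore_missing:
            # Does not override a MissingRule handler passed by the caller.
            handlers.setdefault(MissingRule, None)

        output = {}
        for key, value in blob.items():
            try:
                if key not in self.rules:
                    raise MissingRule(key)
                name, creator = self.rules[key]
                output[name] = creator(value)
            except Exception as exc:
                if exc.__class__ not in handlers:
                    raise
                handler = handlers[exc.__class__]
                if handler is not None:
                    handler(exc, output, key, value)
        return output


def keep_as_is(exc, output, key, value):
    """Liberal-mode handler: copy an unmatched field through untouched."""
    output[key] = value


converter = ToyConverter({
    "100__": ("main_entry_personal_name",
              lambda value: {"personal_name": value["a"]}),
})
blob = {
    "100__": {"a": "Donges, Jonathan F"},
    "999__": {"a": "I'm crazy field!"},
}

default_result = converter.do(blob)
liberal_result = converter.do(blob, exception_handlers={MissingRule: keep_as_is})
```

With the defaults the unknown `999__` field is dropped; passing `keep_as_is` for `MissingRule` keeps it under its raw key; and `ignore_missing=False` without a handler lets `MissingRule` propagate. Note that the lookup is by exact class, so a subclass of a registered exception would not be caught.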

egabancho commented 8 years ago

:+1:

greut commented 8 years ago

@jirikuncar @egabancho Good idea. I've implemented it and the tests were not much impacted. #55 will tell us how solid this currently is...

jirikuncar commented 8 years ago

@greut can you try to give more details in the commit message, as it is going to be used in the release notes?

* NEW Adds new argument ``exception_handlers`` to ... <class.method> ... (addresses #26)