inveniosoftware-attic / jsonalchemy

JSONAlchemy
GNU General Public License v2.0

parser: wrong type of producer rule after parsing #8

Closed MSusik closed 9 years ago

MSusik commented 9 years ago

Let's consider Inspire authors record configuration:

https://github.com/MSusik/inspire-next/blob/A1_authorrecords/inspire/modules/authors/recordext/fields/author.cfg

I noticed that the field _curators_note is not produced when legacy_export_as_marc is used. The producer iterates over each rule with iteritems, which requires a dict, but get_producer_rules returns a set for _curators_note.

I tried to find the source of the problem. It seems that it lies in parser.py. Inside _create_rule I inserted:

        rule_dict['rules'] = rules

        if len(rule) > 2 and 'rule' in rule[2]:
            # Majority of fields store their producers here.
            print rule[2]['rule']
        if json_id == '_curators_note':
            # In _curators_note case it's the last field.
            print rule[-1]['rule']

        if rule.override:

And I received:

{'111__e': 'acronym', '111__d': 'date', '111__g': 'conference_id', '111__a': 'title', '111__c': 'place', '111__b': 'subtitle', '111__9': 'source', '111__y': 'closing_date', '111__x': 'opening_date'}
{'711__b': 'hidden', '711__a': 'public'}
{'411__n': 'number', '411__a': 'title'}
{'110__t': 'new_name', '110__u': 'affiliation', '110__b': 'department', '110__x': 'obsolete_icn', '110__a': 'name'}
{'520__9': 'source', '520__a': 'summary', '520__h': 'hepdata_summary'}
{'693__a': 'accelerator', '693__e': 'experiment'}
{'902__a': ''}
{'003': ''}
{'541__a': 'source', '541__b': 'email', '541__c': 'method', '541__e': 'submission_number'}
{'100__m': 'e_mail', '100__q': 'alternative_name', '100__i': 'INSPIRE_id', '100__v': 'original_affiliation_string', '100__e': 'relator_term', '100__u': 'affiliation', '100__a': 'full_name', '100__j': 'external_id'}
{'490__v': 'volume', '490__a': ''}
{'084__a': '', '084__2': 'standard', '084__9': 'source'}
{'961__c': 'modification_date', '961__x': 'creation_date'}
{'710__': ''}
{'110__a': ''}
{'542__3': 'material', '542__u': 'url', '542__d': 'holder', '542__f': 'statement'}
{'981__a': ''}
{'250__a': ''}
{'6531_a': '', '6531_9': 'source'}
{'536__c': 'grant_number', '536__a': 'agency', '536__f': 'project_number'}
{'595__a': '', '595__b': 'cern_reference'}
{'260__a': 'place', '260__b': 'publisher', '260__c': 'date'}
{'020__b': 'medium', '020__a': ''}
{'041__a': ''}
{'540__u': 'url', '540__3': 'material', '540__b': 'imposing', '540__a': 'license'}
{'595__a': 'value'}
{'500__a': 'value', '500__9': 'source'}
{'269__c': 'date'}
{'773__0': 'recid', '773__o': 'conf_acronym', '773__n': 'journal_issue', '773__c': 'page_artid', '773__z': 'isbn', '773__y': 'year', '773__x': 'pubinfo_freetext', '773__r': 'reportnumber', '773__p': 'journal_title', '773__w': 'cnum', '773__v': 'journal_volume', '773__t': 'confpaper_info'}
{'300__a': ''}
{'999C5y': 'year', '999C5p': 'publisher', '999C5x': 'raw_reference', '999C5s': 'journal_pubnote', '999C5r': 'report_number', '999C51': 'texkey', '999C50': 'recid', '999C5u': 'url', '999C5t': 'title', '999C5i': 'isbn', '999C5h': 'authors', '999C5o': 'number', '999C5m': 'misc', '999C5c': 'collaboration', '999C5a': 'doi', '999C5q': 'maintitle', '999C5e': 'editors'}
{'65017a': 'term', '650179': 'source', '650172': 'scheme'}
{'785__r': 'reportnumber', '785__z': 'isbn', '785__w': 'recid'}
{'695__e': 'energy_range', '695__a': 'keyword', '695__2': 'classification_scheme'}
{'502__d': 'date', '502__c': 'university', '502__b': 'degree_type'}
{'701__u': 'affiliation', '701__i': 'INSPIRE_id', '701__j': 'external_id', '701__a': 'full_name'}
{'210__a': ''}
{'245__a': 'title', '245__9': 'source', '245__b': 'subtitle'}
{'246__a': 'value', '246__b': 'subtitle', '246__9': 'source'}
{'247__b': 'subtitle', '247__a': 'main'}
{'242__a': 'value', '242__b': 'subtitle'}
{'8564_y': 'description', '8564_3': 'material_type', '8564_u': 'url', '8564_w': 'doc_string'}
{'8564_f': 'name', '8564_y': 'description', '8564_z': 'comment', '8564_q': 'eformat', '8564_s': 'size', '8564_u': 'url'}
set(['', '667__a'])

As you can see, the rule for _curators_note has a different type than the rest: it is a set, while every other rule is a dict. The set is propagated further and causes json_for_marc to raise an exception.
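To illustrate why the type matters, here is a minimal sketch, not the actual JSONAlchemy producer code. It assumes the producer walks each rule as key/value pairs; `iterate_rule` is a hypothetical stand-in for that step, and `items()` is used as the Python 3 counterpart of `iteritems()`.

```python
# Sketch only: iterate_rule is a hypothetical stand-in for the producer loop.
dict_rule = {"667__a": ""}   # colon: a dict mapping a MARC tag to a JSON key
set_rule = {"667__a", ""}    # comma: a set holding two unrelated strings


def iterate_rule(rule):
    """Return the rule as (marc_tag, json_key) pairs, as a dict-based producer would."""
    return [(tag, key) for tag, key in rule.items()]


print(iterate_rule(dict_rule))

try:
    iterate_rule(set_rule)
except AttributeError as exc:
    # A set has no items()/iteritems(), so the pairwise walk cannot start.
    print("set rule failed:", exc)
```

The dict rule yields its single pair, while the set rule fails before any field is produced, which matches the missing _curators_note output.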

The rules are created by rules = parser.parseFile(field_file, parseAll=True), which is pyparsing code. I don't know this library, so I don't think I can help here.

egabancho commented 9 years ago

You have a syntax error in your producer rule:

json_for_marc(), {"667__a", ""}

should be:

json_for_marc(), {"667__a": ""}

Can you please verify if it works with the fix so we can close the issue?
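Both spellings are valid Python literals, which is why no parse error was reported: a comma produces a set and a colon produces a dict. The sketch below shows one way such a mistake could be caught early. `check_producer_rule` is a hypothetical helper, not part of JSONAlchemy, and it assumes the rule is available as a literal string.

```python
import ast


def check_producer_rule(source):
    """Hypothetical helper: evaluate a producer rule literal and require a dict."""
    value = ast.literal_eval(source)  # safely evaluates Python literals only
    if not isinstance(value, dict):
        raise TypeError(
            "producer rule must be a dict, got %s: %r"
            % (type(value).__name__, source)
        )
    return value


# The corrected rule passes the check.
print(check_producer_rule('{"667__a": ""}'))

# The original rule is a valid literal, but it evaluates to a set.
try:
    check_producer_rule('{"667__a", ""}')
except TypeError as exc:
    print(exc)
```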

MSusik commented 9 years ago

Thank you, it solved the problem.