RalphTro / epcis-event-hash-generator

ALGORITHM and SOFTWARE PROTOTYPE to uniquely identify/validate the integrity of any EPCIS event through a common, syntax-agnostic approach based on hashing. Takes an EPCIS Document (formatted in either XML or JSON-LD) and returns the corresponding hash value(s).
MIT License

General issue with processing JSON-LD files (PYPI package) #105

Closed RalphTro closed 1 year ago

RalphTro commented 1 year ago

Dear @Echsecutor, I just noticed some odd behaviour: executing the code for e.g. ReferenceEventHashAlgorithm.jsonld works fine on our current master branch. It does NOT work, though, if I use the current package from PyPI, on either my Mac or my Windows machine.
When I leverage the latter, the package throws a number of exceptions, e.g.:

pyld.jsonld.JsonLdError: ('Dereferencing a URL did not result in a valid JSON-LD object. Possible causes are an inaccessible URL perhaps due to a same-origin policy (ensure the server uses CORS if you are using client-side JavaScript), too many redirects, a non-JSON response, or more than one HTTP Link Header was provided for a remote context.',)
Type: jsonld.InvalidUrl
Code: loading remote context failed
Details: {'url': 'https://ref.gs1.org/standards/epcis/2.0.0/epcis-context.jsonld', 'cause': JsonLdError('Could not retrieve a JSON-LD document.')}

This also holds true for other jsonld files. (The package still works fine for XML EPCIS events, though.)

Once you have some bandwidth - would you mind having a look into this?

Many thanks in advance.

@dakbhavesh : FYI as well. Maybe you have an idea why this happens and/or a brief feedback whether you are able to reproduce this issue.

Kind regards, Ralph

dakbhavesh commented 1 year ago

Hi @RalphTro, I also ran into the same issue on my Mac with the latest version, i.e. 1.8.0. It looks like the package is trying to access the context file from https://ref.gs1.org/standards/epcis/2.0.0/epcis-context.jsonld and is unable to do so due to the CORS policy of the server where that file is hosted.

What is happening in the code? File: json_to_py.py, method: _bare_string_pre_preocessing, line 205: expanded = jsonld.expand(json_obj)

Here the code is trying to expand the JSON object keys and needs to fetch the context URL to do so. For some reason, fetching the context URL, i.e. https://ref.gs1.org/standards/epcis/2.0.0/epcis-context.jsonld, doesn't work.
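To separate a network problem from a library problem, one option is to fetch the context URL directly with the standard library, outside of pyld. The following is only a minimal sketch: the helper name fetch_context and its opener parameter are hypothetical and not part of the package.

```python
import json
import urllib.request

CONTEXT_URL = "https://ref.gs1.org/standards/epcis/2.0.0/epcis-context.jsonld"


def fetch_context(url, opener=urllib.request.urlopen, timeout=10):
    """Fetch url and parse the response body as JSON.

    Raises an exception on HTTP errors or if the body is not valid JSON.
    The opener argument exists only so the function can be tested offline.
    """
    request = urllib.request.Request(
        url, headers={"Accept": "application/ld+json, application/json"}
    )
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (needs network access):
#   context = fetch_context(CONTEXT_URL)
#   print(sorted(context))
```

If this call succeeds while jsonld.expand still fails, the URL itself is reachable and the cause is more likely to be in how the library loads remote documents.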

Possible reasons could be: https://www.phind.com/search?cache=c67d718a-131e-4bb5-9da6-28e6a6ed6300

Maybe @Echsecutor, you can shed some light on why fetching the context from the package would be an issue. This was working perfectly fine with prior versions, i.e. 1.6.x. I haven't seen any major changes in the dependent library for a while.

RalphTro commented 1 year ago

Thanks for confirming this/being able to reproduce the issue, @dakbhavesh!

Echsecutor commented 1 year ago

Hi both! Thanks for discovering the issue. CORS is a client-side issue, i.e. this may well be due to Python updates/version differences on your local machines. The LD context fetching should certainly NOT be affected by CORS; that would not make sense.

Echsecutor commented 1 year ago

What is the command you are running to generate the above error?

Echsecutor commented 1 year ago

$ python -m epcis_event_hash_generator -p epcis-event-hash-generator/tests/examples/ReferenceEventHashAlgorithm.jsonld

Hashes of the events contained in 'epcis-event-hash-generator/tests/examples/ReferenceEventHashAlgorithm.jsonld':
ni:///sha-256;6ae96341e0acc6d7a261364751f60e68278a81cdf51da0abb6b4e617014e39d7?ver=CBV2.0

Pre-hash strings:
eventType=ObjectEventeventTime=2020-03-04T10:00:30.000ZeventTimeZoneOffset=+01:00epcListepc=https://id.gs1.org/00/040123450000001112epc=https://id.gs1.org/00/040123450000002225epc=https://id.gs1.org/00/040123450000003338action=OBSERVEbizStep=https://ref.gs1.org/cbv/BizStep-departingreadPointid=https://id.gs1.org/414/4012345000115/254/987{https://ns.example.com/epcis/}myField1{https://ns.example.com/epcis/}mySubField1=2{https://ns.example.com/epcis/}mySubField2=5{https://ns.example.com/epcis/}myField2=0{https://ns.example.com/epcis/}myField3{https://ns.example.com/epcis/}mySubField3=1{https://ns.example.com/epcis/}mySubField3=3

Works with the latest PyPI version 1.9.0, as far as I can tell
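The relation between the pre-hash string and the ni URI shown above can be sketched as follows. This assumes the hash is a plain SHA-256 digest over the UTF-8 encoded pre-hash string; the function name ni_hash is hypothetical, and whether the tool hashes exactly the string as printed is not confirmed in this thread.

```python
import hashlib


def ni_hash(pre_hash_string, version="CBV2.0"):
    """Return an ni URI carrying the SHA-256 hex digest of the UTF-8 encoded string."""
    digest = hashlib.sha256(pre_hash_string.encode("utf-8")).hexdigest()
    return f"ni:///sha-256;{digest}?ver={version}"
```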

Echsecutor commented 1 year ago

Actually https://ref.gs1.org/standards/epcis/2.0.0/epcis-context.jsonld should not be fetched from the internet in the first place. I have added https://github.com/RalphTro/epcis-event-hash-generator/blob/master/epcis_event_hash_generator/file_document_loader.py in order to load exactly this context from file instead
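As an illustration of the idea (not the actual contents of file_document_loader.py), a file-first loader with an internet fallback might look like the sketch below. The returned dictionary follows the shape pyld document loaders return. Naming the local copy after the SHA-256 hex digest of the URL is an assumption, and make_file_loader, cache_dir and fallback_loader are hypothetical names.

```python
import hashlib
import json
import logging
from pathlib import Path


def make_file_loader(cache_dir, fallback_loader):
    """Build a loader that prefers local copies and otherwise calls fallback_loader."""
    cache_dir = Path(cache_dir)

    def loader(url, options=None):
        # Assumption: local copies are named after the SHA-256 hex digest of the URL.
        local_path = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
        try:
            with open(local_path, encoding="utf-8") as handle:
                document = json.load(handle)
        except FileNotFoundError:
            logging.debug("Fallback: Loading %s from the internet", url)
            return fallback_loader(url, options or {})
        logging.debug("Loading %s from file", url)
        # Shape of the "remote document" dictionary expected from a pyld loader.
        return {
            "contentType": "application/ld+json",
            "contextUrl": None,
            "documentUrl": url,
            "document": document,
        }

    return loader
```

With pyld installed, such a loader could be registered via jsonld.set_document_loader(loader) before calling jsonld.expand. Note that if the local file is missing, the fallback is used silently unless debug logging is enabled.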

Echsecutor commented 1 year ago

If you run with log level DEBUG, you should see one of the log messages (logging.debug("Loading %s from file", url) or logging.debug("Fallback: Loading %s from the internet", url)). In my case:

2023-05-16 15:20:05,405 loader (52) [DEBUG]:    Loading https://ref.gs1.org/standards/epcis/2.0.0/epcis-context.jsonld from file

as expected
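These messages are emitted at DEBUG level, so they only show up when the root logger is configured accordingly. The small sketch below demonstrates this with the standard logging module; capture_log and emit_loader_message are hypothetical helpers, and the log format is simplified compared to the line above.

```python
import io
import logging


def capture_log(level, emit):
    """Run emit() with the root logger set to level and return the logged text."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("[%(levelname)s]: %(message)s"))
    root = logging.getLogger()
    previous_level = root.level
    root.addHandler(handler)
    root.setLevel(level)
    try:
        emit()
    finally:
        # Restore the previous logging configuration.
        root.removeHandler(handler)
        root.setLevel(previous_level)
    return buffer.getvalue()


def emit_loader_message():
    # Same call style as the debug message quoted in this thread.
    logging.debug("Loading %s from file", "https://example.org/context.jsonld")
```

Calling capture_log(logging.DEBUG, emit_loader_message) returns the formatted message, while capture_log(logging.INFO, emit_loader_message) returns an empty string.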

Echsecutor commented 1 year ago

Ah, I think I found (part of) the problem. It seems the files for the file loader were not packaged: https://files.pythonhosted.org/packages/2c/6f/3d8c4085b5fec6d023f9bd600af14a642df2857c4b14b59283ac74bec424/epcis-event-hash-generator-1.9.0.tar.gz

They are now included https://files.pythonhosted.org/packages/e9/07/93a7ecef4507ad7dbcb33aa72f11ad33dea0cf68a236553dbc634aaee230/epcis-event-hash-generator-1.9.3.tar.gz
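Whether a downloaded source archive contains the expected files can be checked with the standard tarfile module. This is a minimal sketch; members_with_suffix is a hypothetical helper, and the suffix to look for depends on how the packaged files are actually named.

```python
import tarfile


def members_with_suffix(archive_path, suffix):
    """List the regular files inside a tar archive whose names end with suffix."""
    with tarfile.open(archive_path, "r:*") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(suffix)
        )


# Example call on a downloaded archive:
#   members_with_suffix("epcis-event-hash-generator-1.9.3.tar.gz", ".jsonld")
```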

Echsecutor commented 1 year ago

@RalphTro if you can confirm that this fixes the issue we can close it

dakbhavesh commented 1 year ago

Hi @Echsecutor, with the latest version of the package from PyPI, the event hash gets generated successfully.

(base) bhavesh.shah@Bhaveshs-MacBook-Pro-3 examples % python3 -m epcis_event_hash_generator -p ReferenceEventHashAlgorithm.jsonld

Hashes of the events contained in 'ReferenceEventHashAlgorithm.jsonld':
ni:///sha-256;6ae96341e0acc6d7a261364751f60e68278a81cdf51da0abb6b4e617014e39d7?ver=CBV2.0

Pre-hash strings:
eventType=ObjectEventeventTime=2020-03-04T10:00:30.000ZeventTimeZoneOffset=+01:00epcListepc=https://id.gs1.org/00/040123450000001112epc=https://id.gs1.org/00/040123450000002225epc=https://id.gs1.org/00/040123450000003338action=OBSERVEbizStep=https://ref.gs1.org/cbv/BizStep-departingreadPointid=https://id.gs1.org/414/4012345000115/254/987{https://ns.example.com/epcis/}myField1{https://ns.example.com/epcis/}mySubField1=2{https://ns.example.com/epcis/}mySubField2=5{https://ns.example.com/epcis/}myField2=0{https://ns.example.com/epcis/}myField3{https://ns.example.com/epcis/}mySubField3=1{https://ns.example.com/epcis/}mySubField3=3

RalphTro commented 1 year ago

Dear @Echsecutor, Many thanks for the update.

I am glad to hear that it at least works for you and Bhavesh.

As for me: I upgraded the module on both machines to the most current version (1.9.3), and I installed the latest version of Python on my Mac. On both machines (on Windows, I still work with a fairly recent Python release, i.e. 3.10.2), I can successfully process XML files, but still no jsonld files. E.g., if I run the following command...

python -m epcis_event_hash_generator ReferenceEventHashAlgorithm.jsonld

... I still get a number of errors. Here is an excerpt (in this case, from Windows):

...
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Python310\\lib\\site-packages\\epcis_event_hash_generator\\e532647e8eb371379b8b0e8602d8981c8566bc60f7351f22c76a5bc865962008'
...
pyld.jsonld.JsonLdError: ('Dereferencing a URL did not result in a valid JSON-LD object. Possible causes are an inaccessible URL perhaps due to a same-origin policy (ensure the server uses CORS if you are using client-side JavaScript), too many redirects, a non-JSON response, or more than one HTTP Link Header was provided for a remote context.',)
Type: jsonld.InvalidUrl
Code: loading remote context failed
Details: {'url': 'https://ref.gs1.org/standards/epcis/2.0.0/epcis-context.jsonld', 'cause': JsonLdError('Could not retrieve a JSON-LD document.')}
...

Any ideas what I can do on my side/why this is happening?

Many thanks in advance for sharing your thoughts and kind regards, Ralph

Echsecutor commented 1 year ago

Checking with Ralph:

After pip install of our latest version https://pypi.org/manage/project/epcis-event-hash-generator/release/1.9.3/ the installation directory does not contain the .jsonld files from the package.

Investigating...

Echsecutor commented 1 year ago

Concretely: the release files https://pypi.org/manage/project/epcis-event-hash-generator/release/1.9.3/ contain the jsonld files, but the installation via PyPI on Windows/OSX seems not to contain the files (tested by @RalphTro).
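To see which files actually ended up in the installed package directory, something like the following sketch can be run on the affected machine. The helper installed_files is hypothetical; it only inspects the top level of the package directory.

```python
import importlib.util
from pathlib import Path


def installed_files(package_name, pattern="*"):
    """Return the names of files matching pattern in the installed package directory."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{package_name} is not an installed package")
    found = set()
    for location in spec.submodule_search_locations:
        found.update(p.name for p in Path(location).glob(pattern) if p.is_file())
    return sorted(found)


# Example call:
#   print(installed_files("epcis_event_hash_generator"))
```

Comparing this listing with the contents of the release archive shows whether data files were dropped during installation.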

Echsecutor commented 1 year ago

Noticed that the request document loader sneakily depends on the requests package without our package declaring that dependency. -> Need to add that to our requirements.
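A quick way to check for such an undeclared dependency on a given machine is to ask Python whether the module can be found at all. This is a minimal sketch with a hypothetical helper name, missing_modules.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example call:
#   print(missing_modules(["pyld", "requests"]))
```

An empty list means all listed modules are available. If requests shows up here, installing it (for example with pip install requests) is the workaround until the dependency is declared in the package requirements.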

RalphTro commented 1 year ago

Dear @Echsecutor, Happy to confirm that, after having installed the requests package, this works fine again on both my Mac and my Windows machine. From my POV, we can close this issue, or is there anything left to do from your perspective? Kind regards, Ralph