IHEC / ihec-ecosystems

This repo is for code and documentation associated with the ihec-ecosystems working group
Apache License 2.0

Fix relative path for all schemas #83

Closed by zxenia 4 years ago

zxenia commented 4 years ago

This PR fixes an incorrect relative path error that occurs when running validateHub.py in IHEC_Data_Hub.

The relative path should point upwards in the hierarchy, so two dots (..) should be used instead of one.

e.g. {"$ref": "file:../schemas/json/1.1/dataset.json"}

sitag commented 4 years ago

This will break the XML validator. The XML validator can no longer be run from inside version_metadata, since we need relative (.) imports to use it both as a library and as a module.

sitag commented 4 years ago

@zxenia Can you roll these changes into the PR from @juettemann? https://github.com/IHEC/ihec-ecosystems/pull/91

sitag commented 4 years ago

@zxenia @juettemann you can put whatever file://... you want in schemas, just adjust https://github.com/IHEC/ihec-ecosystems/blob/feb2020/version_metadata/hack.py#L8 accordingly.

sitag commented 4 years ago

will fix with https://github.com/IHEC/ihec-ecosystems/pull/91

zxenia commented 4 years ago

Thanks @sitag !

zxenia commented 4 years ago

Hi @sitag, @juettemann

I looked into an issue with relative paths in validateHub.

The path to the main schema, hub.json, is set in validateHub (it can be anything). The error happens when hub.json refers to a subschema through a relative path, e.g. {"$ref": "file:./schemas/json/1.1/dataset.json"}. The subschema can't be found because relative references are resolved relative to the working directory, not relative to the schema file they came from.

Only the main schema, hub.json, is loaded in validateHub; the rest are loaded during validation by following the relative paths.
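
The difference between one dot and two dots can be sketched with plain path arithmetic. This is a minimal illustration, not the code in validateHub.py: the helper `resolve_file_ref` is hypothetical, and the `/repo/...` layout (schemas at the repository root, the validator started from IHEC_Data_Hub) is an assumption made for the example.

```python
import posixpath


def resolve_file_ref(ref: str, base_dir: str) -> str:
    """Resolve a 'file:<relative path>' reference against base_dir.

    Hypothetical helper for illustration only; it does pure path
    arithmetic and never touches the filesystem.
    """
    prefix = "file:"
    if not ref.startswith(prefix):
        raise ValueError(f"not a file: reference: {ref!r}")
    relative = ref[len(prefix):]
    return posixpath.normpath(posixpath.join(base_dir, relative))


# Assumed layout: the schemas live in /repo/schemas/json/1.1/,
# and the validator is started from /repo/IHEC_Data_Hub.
working_dir = "/repo/IHEC_Data_Hub"

# One dot: the lookup stays inside the working directory.
one_dot = resolve_file_ref("file:./schemas/json/1.1/dataset.json", working_dir)

# Two dots: the lookup climbs one level first and reaches the repo root.
two_dots = resolve_file_ref("file:../schemas/json/1.1/dataset.json", working_dir)

print(one_dot)
print(two_dots)
```

Under these assumptions, only the two-dot form lands in the directory where the schemas are stored.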

I went through your code here https://github.com/IHEC/ihec-ecosystems/blob/feb2020/version_metadata/prevalidate.py

As far as I can see, you parse the relative paths here: https://github.com/IHEC/ihec-ecosystems/blob/feb2020/version_metadata/prevalidate.py#L20

so the path to the schema is e.g. 'schemas/json/1.1/dataset.json'

Does this mean you call all subschemas separately in prevalidation? If not, how are the subschemas resolved relative to each other? Currently, validateHub loads only hub.json, and the rest of the schemas are pulled in via relative references. Thanks!
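
For comparison, when a schema is identified by an absolute base URI, standard URI reference resolution places a plain relative reference next to the referring file rather than in the working directory. The sketch below uses only `urllib.parse.urljoin` from the Python standard library. The base URI and file names are assumptions, and it does not describe what prevalidate.py or validateHub.py actually do.

```python
from urllib.parse import urljoin

# Assumed absolute location of the main schema (illustrative only).
base_uri = "file:///repo/schemas/json/1.1/hub.json"

# A plain relative reference (no scheme) is resolved against the base URI,
# i.e. relative to the directory that contains hub.json.
dataset_uri = urljoin(base_uri, "dataset.json")
print(dataset_uri)
```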

sitag commented 4 years ago

@zxenia your prevalidate reference is out of sync. There are no references to file: anywhere in prevalidate.py.
You can adjust the relative paths for all schema files in schemas/json/$version/*json, so any file:./schemas/json/... reference can become file:../schemas/json/... if you want. Just commit the paths you need to @juettemann's PR and we can add the fix there.
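
The path adjustment described above could be done by hand or with a small script. Below is a minimal sketch of the rewrite step, assuming the references appear as "$ref" string values in the parsed JSON. The function name `rewrite_refs` and the sample schema fragment are made up for illustration and are not part of the repository.

```python
import json

OLD_PREFIX = "file:./schemas/json/"
NEW_PREFIX = "file:../schemas/json/"


def rewrite_refs(node):
    """Return a copy of a parsed schema with matching "$ref" values rewritten."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            is_old_ref = (
                key == "$ref"
                and isinstance(value, str)
                and value.startswith(OLD_PREFIX)
            )
            if is_old_ref:
                # Swap only the prefix; the rest of the path is kept as-is.
                result[key] = NEW_PREFIX + value[len(OLD_PREFIX):]
            else:
                result[key] = rewrite_refs(value)
        return result
    if isinstance(node, list):
        return [rewrite_refs(item) for item in node]
    return node


# Made-up schema fragment, only for demonstration.
schema_text = (
    '{"properties": {"datasets": '
    '{"$ref": "file:./schemas/json/1.1/dataset.json"}}}'
)
fixed = rewrite_refs(json.loads(schema_text))
print(json.dumps(fixed, indent=2))
```

To apply this to real files, each schema would be loaded with `json.load`, passed through `rewrite_refs`, and written back. Reviewing the resulting diff before committing is advisable, since the XML validator may depend on the current paths.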