Closed linglp closed 2 years ago
@brynnz22 for awareness
@GiaJordan Let me know if you have time to take a look. Thank you!!
The original issue is here: https://github.com/Sage-Bionetworks/schematic/issues/845
After comparing the error logs generated by using CSBCDataModel_old.jsonld and CSBCDataModel_new.jsonld, I think the problem might be that the following error did not get printed in the console.
[[2, 0, "'' is not one of ['Aorta', 'Endothelium', 'Tendon', 'Head and Neck', 'Bladder', 'Intestine', 'Epithelium', 'Ovary', 'Olfactory Mucosa', 'Uterus', 'Kidney', 'Periodontal Ligament', 'Hippocampus', 'Gastrointestinal Tract', 'Not Applicable', 'Vagina', 'Intrathoracic Lymph Nodes', 'Pending Annotation', 'Synovial Membrane', 'Fallopian Tube', 'Gastroesophageal Junction', 'Cornea', 'Thymus', 'Stomach', 'Respiratory System', 'Brain', 'Esophagus', 'Placenta', 'Hematopoietic System', 'Peripheral Nerves', ", ''], [3, 0, "'' is not one of ['Aorta', 'Endothelium', 'Tendon', 'Head and Neck', 'Bladder', 'Intestine', 'Epithelium', 'Ovary', 'Olfactory Mucosa', 'Uterus', 'Kidney', 'Periodontal Ligament', 'Hippocampus', 'Gastrointestinal Tract', 'Not Applicable', 'Vagina', 'Intrathoracic Lymph Nodes', 'Pending Annotation', 'Synovial Membrane', 'Fallopian Tube', 'Gastroesophageal Junction', 'Cornea', 'Thymus', 'Stomach', 'Respiratory System', 'Brain', 'Esophagus', 'Placenta', 'Hematopoietic System', 'Peripheral Nerves', ", '']]
Summary of conversation with @linglp:
JSONSchema validation errors are not, and have never been, displayed as error messages in the same way as validation rule errors. Options: add methods to the GenerateError class to format and display valid-values errors, or have users run manifest validation separately from submission to see the full list of errors. We most likely don't want to display the full list of errors at submission, to avoid overcrowding the output.
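A rough sketch of what the first option could look like is below. It is an illustration only: `summarize_schema_errors` and its parameters are hypothetical names, not part of schematic's GenerateError class, and it assumes each error is a `[row, column, message, value]` list shaped like the entries in the log above.

```python
from typing import Any, List, Sequence


def summarize_schema_errors(
    errors: Sequence[Sequence[Any]],
    max_errors: int = 5,
    max_message_chars: int = 120,
) -> List[str]:
    """Return short, readable lines for JSON Schema style errors.

    Hypothetical helper: assumes each error is [row, column, message, value].
    """
    lines = []
    for row, column, message, value in errors[:max_errors]:
        # Long "is not one of [...]" messages are cut so each error stays short.
        if len(message) > max_message_chars:
            message = message[:max_message_chars].rstrip() + "..."
        lines.append(f"row {row}, column {column}: {message} (got {value!r})")
    hidden = len(errors) - max_errors
    if hidden > 0:
        # Point users at a separate validation run instead of printing everything.
        lines.append(
            f"... and {hidden} more error(s); run validation separately for the full list"
        )
    return lines


# Shortened stand-ins for the entries in the log; the real lists are much longer.
sample = [
    [2, 0, "'' is not one of ['Aorta', 'Endothelium', 'Tendon']", ""],
    [3, 0, "'' is not one of ['Aorta', 'Endothelium', 'Tendon']", ""],
]
for line in summarize_schema_errors(sample, max_errors=1):
    print(line)
```

Capping both the number of errors and the message length keeps the submission output short, while the separate validate run remains the place to see everything.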
@brynnz22 I know that we also discussed this issue. Please also refer to Gianna's comment above. After looking more into the code, we found that JSONSchema validation errors (e.g. when you didn't fill out "Dataset Tissue", which is a required column in the JSON Schema) have not been displayed in the same way as validation rule violations. I have a PR in progress to change that, but in the meantime, you could use the validate command: schematic model -c config.yml validate -mp /Users/lpeng/Documents/test-schematic/CA184897.csv -dt DatasetView, and see the full error messages.
@linglp I am not exactly sure what you and @GiaJordan discussed, so I am a little lost on these comments. If I use the validation command schematic model -c config.yml validate -mp /Users/lpeng/Documents/test-schematic/CA184897.csv -dt DatasetView, I get a very long output saying everything is invalid when testing on a manifest, because most of the attributes are "list like" and Schematic is not recognizing single-value lists. This happens even if I fill in the Dataset Tissue column with correct values. Let me know what you would like me to do, as I don't understand.
@brynnz22 You shouldn't be seeing a long list of error messages if you have regenerated your JSON-LD after updating the old data model (and changed the rule from list to list like). I will send you the JSON-LD and manifest that I used over Slack.
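For readers following along, the difference between the two rules can be sketched roughly as follows. This is an assumption about the intended semantics based on this thread (a strict list rule rejects single values, while list like accepts them), not schematic's actual implementation, and `parse_list_cell` is a hypothetical helper.

```python
from typing import List


def parse_list_cell(cell: str, strict: bool = False) -> List[str]:
    """Split a comma-separated manifest cell into a list of values.

    Hypothetical helper. With strict=True a cell without a comma is rejected;
    with strict=False (the assumed "list like" behaviour) a single value is
    accepted and returned as a one-item list.
    """
    if strict and "," not in cell:
        raise ValueError(f"{cell!r} is not a comma-separated list")
    # Drop empty pieces so a trailing comma such as "Mouse," still parses.
    return [part.strip() for part in cell.split(",") if part.strip()]


print(parse_list_cell("Mouse"))          # single value is accepted
print(parse_list_cell("Brain, Kidney"))  # two values
```

Under this reading, a manifest full of single-value cells would fail wholesale against a strict rule, which matches the long list of errors described above.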
Here are the validation errors I get, which appear to be a bug - e.g. Mouse is a valid value. I used the same JSON-LD and manifest attached. Tagging @vpchung
Note: The issue that Brynn mentioned about validation is an issue for the released version 22.7.1, and the develop branch is working as expected.
Hi @linglp - thanks for the quick update.
Because we're on a tight timeline to get a new manifest up, would you recommend we switch to the develop branch to validate, seeing that it is working as expected?
Hi @vpchung! Yes, you could use the develop branch (to use the develop branch, please refer to the main README documentation here). In the meantime, we are trying to do a release this week (but we still need some time to review the backlog and all the PRs that we plan to include).
Sounds great, thanks @linglp! We really appreciate your team's effort on this - looking forward to the release! We'll try using the develop branch in the meantime.
Describe the bug
If you convert a CSV data model with "list like" as one of the validation rules, the JSON-LD file that is generated does not seem to list errors when it is later used for manifest submission.
My guess is that there is something wrong with the JSON-LD generated by schematic when the data model CSV contains "list like".
To Reproduce
Steps to reproduce the behavior:
1. Check out the develop branch
2. Run schematic schema convert /Users/lpeng/Documents/schematic-git2/schematic/tests/data/CSBCDataModel_new.csv
3. Update config.yml so that it points to the CSBCDataModel_new.jsonld
4. Run schematic model -c config.yml submit -mp /Users/lpeng/Documents/test-schematic/CA184897.csv -d syn33691763 -vc DatasetView -mrt table
UPGRADE AVAILABLE
A more recent version of the Synapse Client (2.6.0) is available. Your version (2.5.1) can be upgraded by typing: pip install --upgrade synapseclient
Python Synapse Client version 2.6.0 release notes
https://python-docs.synapse.org/build/html/news.html
[####################]100.00% 1/1 Done...
Downloading [####################]100.00% 23.5MB/23.5MB (8.3MB/s) SYNAPSE_TABLE_QUERY_101701260.csv.synapse_download_101701260 Done...
JSON schema successfully generated from schema.org schema!
JSON schema file log stored as tests/data/CSBCDataModel_new.DatasetView.schema.json
/Users/lpeng/Library/Caches/pypoetry/virtualenvs/schematicpy-9OomxyhV-py3.9/lib/python3.9/site-packages/jinja2/environment.py:1088: DeprecationWarning: 'soft_unicode' has been renamed to 'soft_str'. The old name will be removed in MarkupSafe 2.1.
  return concat(self.root_render_func(self.new_context(vars)))
0 expectation(s) included in expectation_suite.
Calculating Metrics: 0it [00:00, ?it/s]
error: Validation errors resulted while validating with 'DatasetView'.