Closed brynnz22 closed 2 years ago
@brynnz22 Can you upload your config.yml file too?
yml files are not supported here, so I can't upload it. I've attached a screen shot:
Hey @brynnz22 , I have been trying to replicate your error but am not experiencing any issues with uploading the manifest as a table to synapse. Here is where the table uploaded in my scratch space. Here is where the CSV loaded.
Can you remake your JSONLD then try uploading the table again? If that does not work, there may be some sort of issue if you are re-uploading a table, if there was a change in between uploads to the columns. I was able to keep uploading to the same table, however, with no error.
I personally pulled your JSONLD from your CSBC GitHub, as well as made it from the model you provided, and both worked.
I also noticed some things in your .yml that are not currently causing errors but might in the future.
- manifest_basename should not have .csv at the end.
- data_type should be a list formatted as:
data_type:
  - DatasetView
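The two config notes above can be checked mechanically. The following is a minimal Python sketch only: the helper name `lint_config` and the flat dictionary layout are assumptions made for illustration (the real config.yml may nest these keys differently), and none of this is part of schematic itself.

```python
def lint_config(config: dict) -> list:
    """Return warnings for the two config issues mentioned above.

    Hypothetical helper: `config` is assumed to be a flat dict that holds
    the two keys directly, which may not match the real file layout.
    """
    warnings = []

    # manifest_basename should be a bare name without the .csv extension.
    basename = config.get("manifest_basename", "")
    if isinstance(basename, str) and basename.endswith(".csv"):
        warnings.append("manifest_basename should not end with .csv")

    # data_type should be a list, even when it has a single entry.
    if not isinstance(config.get("data_type"), list):
        warnings.append("data_type should be a list, e.g. ['DatasetView']")

    return warnings


if __name__ == "__main__":
    print(lint_config({"manifest_basename": "manifest.csv", "data_type": "DatasetView"}))
    print(lint_config({"manifest_basename": "manifest", "data_type": ["DatasetView"]}))
```

To use it on a real file, the YAML would first have to be loaded into a dictionary and the relevant section passed in.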
@mialy-defelice thanks for looking into it. I remade the JSONLD. I also installed the latest version of Schematic and I am continuing to get the same error. I am not trying to re-upload a table after making a change to the manifest. I am just trying to do the initial upload. The JSONLD from the CSBC Github is not up to date. Could you try downloading it from this google sheet and let me know if you are able to replicate my error? I attached a screenshot of the error message:
Thanks for the info about the yaml file. I will update it.
Could you try downloading it from this google sheet and let me know if you are able to replicate my error?
I do not have access to ^this google sheet.
I did remake the JSONLD from the CSV model you originally provided. Is this one different? I added the GitHub one as an extra test to see if there was a difference.
That's weird. It says you have access. Maybe I linked something different. Here is the correct link: https://docs.google.com/spreadsheets/d/1hbG1vdi8a0gc-6psn5VgsBgyMl6IOhDSbHWDuC_8EXo/edit#gid=59750547
Oh yes that is the same csv as the google sheet.
Okay... can you send me your JSONLD via Slack? If that does not get me to the error maybe I will try to see if we can get someone else to try to reproduce it.
@linglp can you try to reproduce the error that @brynnz22 is experiencing? I have been testing it, but have not had any issues uploading the manifest table to Synapse.
@brynnz22 and @mialy-defelice Yes, I could reproduce the error by running:
schematic model -c config.yml submit -mp tests/data/mock_manifests/CA184897.csv -d syn33691763 -mrt table
But I think the error could be resolved if you add the -vc flag. Try something like this:
schematic model -c config.yml submit -mp tests/data/mock_manifests/CA184897.csv -d syn33691763 -vc DatasetView -mrt table
From my end, I could see a list of error messages:
error: For attribute Dataset Assay in row 2 it does not appear as if you provided a comma delimited string. Please check your entry ('RNA Sequencing'') and try again.
error: For attribute Dataset Assay in row 3 it does not appear as if you provided a comma delimited string. Please check your entry ('RNA Sequencing'') and try again.
error: For attribute Dataset Species in row 2 it does not appear as if you provided a comma delimited string. Please check your entry ('Mouse'') and try again.
error: For attribute Dataset Tumor Type in row 2 it does not appear as if you provided a comma delimited string. Please check your entry ('Neuroblastoma'') and try again.
error: For attribute Dataset Tumor Type in row 3 it does not appear as if you provided a comma delimited string. Please check your entry ('Breast Carcinoma'') and try again.
error: For attribute Dataset Consortium Name in row 2 it does not appear as if you provided a comma delimited string. Please check your entry ('ICBP'') and try again.
error: For attribute Dataset Consortium Name in row 3 it does not appear as if you provided a comma delimited string. Please check your entry ('ICBP'') and try again.
error: For attribute Dataset Grant Number in row 2 it does not appear as if you provided a comma delimited string. Please check your entry ('CA184897'') and try again.
error: For attribute Dataset Grant Number in row 3 it does not appear as if you provided a comma delimited string. Please check your entry ('CA184897'') and try again.
error: For attribute Dataset Pubmed Id in row 2 it does not appear as if you provided a comma delimited string. Please check your entry ('26907613'') and try again.
error: For attribute Dataset Pubmed Id in row 3 it does not appear as if you provided a comma delimited string. Please check your entry ('30755444'') and try again.
error: For attribute Dataset Tissue in row 2 it does not appear as if you provided a comma delimited string. Please check your entry (''') and try again.
error: For attribute Dataset Tissue in row 3 it does not appear as if you provided a comma delimited string. Please check your entry (''') and try again.
error: Validation errors resulted while validating with 'DatasetView'.
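A long run of messages like the one above is easier to read when grouped by attribute. Here is a small Python sketch that does this; the regular expression is written against the wording of the lines shown in this thread and is an assumption, not a format guaranteed by schematic. The function name `rows_by_attribute` is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "error: For attribute Dataset Assay in row 2 it does not appear ..."
ERROR_RE = re.compile(r"^error: For attribute (?P<attr>.+?) in row (?P<row>\d+) ")


def rows_by_attribute(log_text: str) -> dict:
    """Group the failing row numbers by attribute name."""
    grouped = defaultdict(list)
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:  # summary lines that do not name an attribute are skipped
            grouped[match.group("attr")].append(int(match.group("row")))
    return dict(grouped)


if __name__ == "__main__":
    sample = "\n".join([
        "error: For attribute Dataset Assay in row 2 it does not appear as if you provided a comma delimited string.",
        "error: For attribute Dataset Assay in row 3 it does not appear as if you provided a comma delimited string.",
        "error: For attribute Dataset Species in row 2 it does not appear as if you provided a comma delimited string.",
        "error: Validation errors resulted while validating with 'DatasetView'.",
    ])
    for attribute, rows in rows_by_attribute(sample).items():
        print(attribute, rows)
```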
@brynnz22 more explanation of the error message: "error: Component 'None' is not present in '~/CSBCDataModel.jsonld', or is invalid."
schematic was trying to find the component that you are validating (because validation also gets triggered when you are submitting manifests), and since you didn't specify the -vc flag, schematic thinks the component is "None" (and it makes sense that "None" is not present in the data model).
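To avoid forgetting the component, the submit command can be assembled in a small script that refuses to run without one. This is only a sketch in Python: the function `build_submit_command` is hypothetical, and it uses only the flags that appear in the commands quoted in this thread.

```python
import shlex


def build_submit_command(config, manifest_path, dataset_id, component, record_type="table"):
    """Build the argv list for the submit command used in this thread.

    Hypothetical helper: raises early if no component is given, since an
    omitted -vc value is what led to the "Component 'None'" error here.
    """
    if not component:
        raise ValueError("a component (e.g. DatasetView) is required for -vc")
    return [
        "schematic", "model", "-c", config,
        "submit",
        "-mp", manifest_path,
        "-d", dataset_id,
        "-vc", component,
        "-mrt", record_type,
    ]


if __name__ == "__main__":
    argv = build_submit_command(
        "config.yml",
        "tests/data/mock_manifests/CA184897.csv",
        "syn33691763",
        "DatasetView",
    )
    # Print the command for review; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```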
Thank you @linglp this is super helpful. I will use the -vc flag in my command. I'm also wondering about the specific error messages you listed above.
Do you know why it is giving these errors? It appears that it thinks there are empty strings after the entries, but from my end, there do not appear to be any. Basically, it is giving an error for all of the attributes in the manifest that are lists.
@brynnz22 for example, the first line of the error message is saying that the column "Dataset Assay" should accept a list. (I also double-checked the CSV version of the data model and found that the validation rule of "Dataset Assay" is indeed list.)
But your entry "RNA Sequencing" in the manifest is not a comma-separated list.
The last line of the error message indicates that you did not input a value for the column "Dataset Tissue". But based on the data model, this is a required column and accepts "list".
@linglp yes dataset assay along with the other attributes should accept lists, but many will just have one entry. Can Schematic accept lists with only one item in them? This is pretty important.
You are correct about the tissue attribute, I will have to fix the manifests to list something for the empty values.
@brynnz22 If you could change the list rule in the data model to list like, then schematic would allow a single entry (without it being a comma-separated list). list by default is treated as list strict, which means that you have to enter a comma-separated list.
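The difference between the two rules, as described in this comment, can be illustrated with a toy check in Python. This is not schematic's implementation; the function `check_list_entry` is hypothetical, and treating an empty value as invalid assumes the attribute is required, as Dataset Tissue is in this thread.

```python
def check_list_entry(value: str, rule: str = "list strict") -> bool:
    """Toy model of the two list rules described above.

    - "list strict": the entry must be a comma-separated list.
    - "list like":   a single entry without commas is also accepted.
    Empty entries fail under both rules (assumes a required attribute).
    """
    entry = value.strip()
    if not entry:
        return False
    if rule == "list like":
        return True
    if rule == "list strict":
        return "," in entry
    raise ValueError(f"unknown rule: {rule!r}")


if __name__ == "__main__":
    for rule in ("list strict", "list like"):
        for value in ("RNA Sequencing", "Mouse, Human", ""):
            print(f"{rule!r:14} {value!r:18} -> {check_list_entry(value, rule)}")
```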
That's great to know. Thank you!
@linglp can we reopen this issue? I tried to use your -vc solution and it is now giving this error: error: Component 'DatasetView' is not present in '/Users/bzalmanek/Documents/csbc_schematic/schematic/tests/data/csbc.model.jsonld', or is invalid.
Would it be possible to meet over a zoom call to go over this? I am not sure if I have Schematic installed incorrectly or if it is something else.
@brynnz22 Hi Brynn! I tried schematic model -c config.yml submit -mp tests/data/mock_manifests/CA184897.csv -d syn33691763 -vc DatasetView -mrt table again, and I can't reproduce your error.
But I would recommend the following:
schematic schema convert test/data/CSBCDataModel.csv
Please note that CSBCDataModel.csv should only contain the first sheet of the CSBCDataModel.csv that you provided above.
{
"@id": "bts:DatasetView",
"@type": "rdfs:Class",
"rdfs:comment": "The denormalized manifest for dataset submission.",
"rdfs:label": "DatasetView",
And yes, if you still get the error, we could set up a zoom call
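Whether a generated JSON-LD file actually contains the DatasetView class from the snippet in this comment can be checked with a few lines of Python. This is a sketch under one assumption that the thread itself does not confirm: that class entries sit in a top-level "@graph" list. The function name `has_component` is hypothetical.

```python
import json


def has_component(jsonld: dict, component: str) -> bool:
    """Return True if an entry with "@id" == "bts:<component>" exists.

    Assumption: entries are stored in a top-level "@graph" list.
    """
    target = f"bts:{component}"
    return any(node.get("@id") == target for node in jsonld.get("@graph", []))


if __name__ == "__main__":
    # Inline stand-in for a data model; a real file would be read with
    # json.load(open(path)) instead.
    model = json.loads("""
    {
      "@graph": [
        {
          "@id": "bts:DatasetView",
          "@type": "rdfs:Class",
          "rdfs:comment": "The denormalized manifest for dataset submission.",
          "rdfs:label": "DatasetView"
        }
      ]
    }
    """)
    print(has_component(model, "DatasetView"))
    print(has_component(model, "None"))
```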
@linglp yes, I have followed those exact steps many times and I continue to get the error. My json ld looks exactly like that
I will see if Victor, whom I am working with, can get it to work on his computer.
@brynnz22 thanks. Let me know if you figure it out. If not, we could schedule a zoom call. We released a new version of schematic in July this year, so if you are using that version, it should be fine too.
@linglp yes, can we please meet? Victor is getting a completely different error.
@brynnz22 feel free to message me on Slack. I also tried to pip install schematic again (the version that I installed here is 22.7.1, which is the latest version that we released in July), and I still can't reproduce your error.
This issue happened because the JSON-LD data model was generated using an older version of schematic. It can be resolved by installing the latest version of schematicpy from PyPI. @brynnz22 Let me know if you have other questions.
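Since the root cause was an outdated install, it can help to confirm which version is present before regenerating the JSON-LD. The Python sketch below reads the installed version of the schematicpy package named in this comment. Using 22.7.1 as the reference is an assumption based on the version mentioned earlier in the thread, and the simple numeric comparison ignores pre-release suffixes.

```python
from importlib import metadata


def installed_version(package: str = "schematicpy"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version: str) -> tuple:
    """Turn '22.7.1' into (22, 7, 1); non-numeric parts are dropped."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_at_least(found: str, minimum: str = "22.7.1") -> bool:
    """Compare two dotted versions numerically (simplified comparison)."""
    return version_tuple(found) >= version_tuple(minimum)


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("schematicpy is not installed")
    elif is_at_least(found):
        print(f"schematicpy {found} is at least 22.7.1")
    else:
        print(f"schematicpy {found} is older than 22.7.1; consider upgrading")
```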
Describe the bug
When attempting to upload a manifest as a table, the following error occurs:
error: Component 'None' is not present in '/Users/bzalmanek/Documents/csbc_schematic/schematic/tests/data/csbc.model.jsonld', or is invalid.
Attached are the data model and an example manifest.
To Reproduce
Steps to reproduce the behavior: run
schematic model -c /Users/bzalmanek/Documents/csbc_schematic/schematic/config.yml submit -mp /Users/bzalmanek/Desktop/split_tables-MAY_INCLUDED/dataset/CA184897.csv -d syn33691763 -mrt table
with your own values for the config.yml file, manifest path, and dataset id.
Expected behavior
A table will upload without the error.
Additional context
CSBC Data Model, CA184897.csv