Open brynnz22 opened 2 years ago
Related to this, I would request that the JSON-LD is not generated with the bts model included by default and that we have more control of defined namespaces, because we might register the model elsewhere at e.g. BioPortal or identifiers.org/.
@milen-sage could we schedule this one to be addressed in Sprint 12?
@ychae yes, please feel free to triage it.
Note: first step is improving docs. The base schema by default is biothings, but it's easy to configure schematic to use something else or not use a base schema. SH can include instructions on that; we should also add as an item in the data modeling workshop, perhaps in the things to consider section.
AC:
@milen-sage will set up meeting with @brynnz22
This is an easy fix if one is using the developer mode of schematic (to be covered in our workshop). However, we can make it easier.
Currently, a user can go to schematic/utils/io_utils.py
and change
data_path = "data_models/biothings.model.jsonld"
to
data_path = "data_models/schema_org.model.jsonld"
This would just load schema.org. If a user has a simpler schema to start with (or no schema); this path can be adjusted further.
Note that the user should have their schematic developer mode setup, so that these changes are reflected in the CLI - either by rebuilding the schematic package with poetry; or, by using the poetry shell when making changes.
Instead of the current process, we should just provide a config option for the default schema to load; we can still use biothings as a default for now, but can also move to schema.org or a different schema in the future.
@milen-sage sorry to come back to this issue. I changed the io.utils file as specified above, and I tried changing our data model back to using "Cell" as one of our valid values that clashed with biothings, and I am continuing to get a displayName error. Is it possible that "Cell" is clashing with Schema.org as well?
@brynnz22 could you post the updated json-ld schema? Thanks!
@milen-sage I sent it to you via slack. jsonld is not supported here. The only differences are that I changed the Plain Text
valid value for the Tool Input/Output Data
attributes to Text
. This impacts the generation for the ToolView and Tool manifests. I also changed the Cell (Journal)
valid value for the Publication Journal
attribute to Cell
. This impacts the generation of the PublicationView and Publication manifests.
@GiaJordan
Encountering issues when using the schema.org
base schema even with our example data model and manifest. A KeyError
exception is raised during the process of loading schema into network. It appears to be due to variations in the structure of the base schema.org
and BioThings
schemas. Next steps include a new method of unpacking the schema that works on both schemas, as well as other possible future ones as mentioned above.
Related to #810 and #856
cc: @milen-sage
After speaking with @milen-sage and using a newer version of the shema.org
schema, I'm able to load it in as the base schema during manifest generation. When generating a Publication
manifest as described above, the KeyError
for a missing diaplayName
key in a node dictionary still persists, so more investigation is necessary at this time
Hmm, I wonder if this is also related to #888. Let's keep investigating. A fix might resolve both 🤞
Partially addressed in #903, work will continue to determine cause of error and relation to base schema used Feature request will be tracked here #856 @brynnz22 @anngvu
It's like getting in on fucking sunlight before there was fucking sunlight. u understand?
Looks like the internet is catching up. Spam message has been reported.
@milen-sage I know this has a long history and a lot of linked issues (several resolved). This is included on the MC2 plan so I am wondering if it's still an issue. I can add to Jira as a research to confirm if the issue is still relevent unless you already know.
@AmyHeiser yes this is still an issue - @mialy-defelice has captured it to address as part of the refactor (specifying custom base schema or no base schema on which data models extend).
Linking new Jira (combined issue) https://sagebionetworks.jira.com/browse/FDS-69
Describe the bug Many of the biothings namespaces clash with terms used in the CSBC data model, and other data models. For example the attribute,
Publication Journal
, in the CSBC data model has a valid value ofCell
. Because this is a term in Biothings, schematic will give a displayName error and will not generate the manifest. I know this is already on @milen-sage radar.To Reproduce Steps to reproduce the behavior:
Cell
orProtein
)Expected behavior A manifest would generate without errors.
Priority (select one)
Desktop (if applicable, please complete the following information):