codeforpdx / PASS

PASS project - with HMIS module integration
MIT License

Update document types to use RDF Schema #258

Open timbot1789 opened 1 year ago

timbot1789 commented 1 year ago

Is your feature request related to a problem? Please describe.

Currently, all our documents are assigned a type, which is a value from the enum docTypes. However, defining shared values like this reduces interoperability, which is a core value proposition of our application. The only way for someone to know what a document's type means is to run it through the PASS application. Other applications won't be able to interpret it effectively.

Describe the solution you'd like

Replace the values in docTypes with URLs from schema.org or a similar web ontology. We should be able to find values for driver's license, passport, and bank statement at least.

Additional context

The interoperability standards for Solid are not very strong right now, and they are likely to be revised in the next couple of years. However, we can get a good head start on interoperability by making sure that any information written out by PASS does not come from the application itself. Instead, it should come from either:

a) the user
b) a schema library like schema.org

Some recommended libraries:

https://hmis-interop.github.io/
schema.org

Luckynotrich commented 1 year ago

Is this still open? I would be interested in researching a solution. I feel I would need a review of my findings before I changed anything.

Luckynotrich commented 1 year ago

After some research, I've found that both Driver's License and Passport could be represented by GovernmentPermit. For Bank Statement I found DepositAccount, PaymentService and PaymentCard. I notice that the last three all have "pending" in the URL, and after looking at the issues pages I found that they all have "Many minor micro data issues", and DepositAccount has a discussion around crypto.

Luckynotrich commented 1 year ago

Is the change as simple as editing doc_types.js from this:

const DOC_TYPES = {
  BankStatement: 'Bank Statement',
  Passport: 'Passport',
  DriversLicense: "Driver's License",
  Other: 'Other'
};

to this:

const DOC_TYPES = {
  BankStatement: 'Bank Statement',
  Passport: 'https://schema.org/GovernmentPermit',
  DriversLicense: 'https://schema.org/GovernmentPermit',
  Other: 'Other'
};

Luckynotrich commented 1 year ago

Then how would I test to know if they were working?
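One low-cost check, sketched here under assumptions, is a unit-style test asserting that every document type value parses as an absolute http(s) IRI. The DOC_TYPES object below is a copy of the proposed edit above, and `isHttpIri` and `findNonIriValues` are hypothetical helper names, not existing PASS functions. The check only covers syntax; it says nothing about whether another Solid application understands the term.

```javascript
// Copy of the proposed DOC_TYPES from the comment above (for illustration only).
const DOC_TYPES = {
  BankStatement: 'Bank Statement',
  Passport: 'https://schema.org/GovernmentPermit',
  DriversLicense: 'https://schema.org/GovernmentPermit',
  Other: 'Other'
};

// Returns true when the value parses as an absolute http(s) IRI.
function isHttpIri(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    // new URL() throws on relative or malformed input such as 'Other'.
    return false;
  }
}

// Hypothetical helper: lists the keys whose values are still app-local strings.
function findNonIriValues(docTypes) {
  return Object.entries(docTypes)
    .filter(([, value]) => !isHttpIri(value))
    .map(([key]) => key);
}

console.log(findNonIriValues(DOC_TYPES)); // [ 'BankStatement', 'Other' ]
```

A test like this would fail until every entry has an IRI, which makes the remaining gaps (here BankStatement and Other) visible.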

leekahung commented 1 year ago

For these, you should be using them as part of the RDF schema as opposed to the DOC_TYPES object.

Ideally, you would define an RDF predicate that uses the schema term you've just found together with the document type. For example, to add a passport to the RDF, you would use .addStringNoLocale("https://schema.org/GovernmentPermit", DOC_TYPES["Passport"]). As for the schemas, we've been storing them inside src/constants/rdf_predicates.js.
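To make the suggestion concrete: `addStringNoLocale` in @inrupt/solid-client attaches a plain string literal, with no language tag, to a subject under the given predicate. The sketch below does not call that library. It only prints the statement the call above would describe, in N-Triples form, so the shape of the stored data is visible. The subject IRI and the helper names `escapeLiteral` and `toNTriple` are assumptions made for illustration.

```javascript
// Trimmed copy of DOC_TYPES with only the entry used here.
const DOC_TYPES = { Passport: 'Passport' };

// Escapes backslashes, double quotes and line breaks, as N-Triples
// requires inside a string literal.
function escapeLiteral(text) {
  return text
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
}

// Hypothetical helper: renders one subject/predicate/string-literal statement.
function toNTriple(subjectIri, predicateIri, literal) {
  return `<${subjectIri}> <${predicateIri}> "${escapeLiteral(literal)}" .`;
}

// The subject IRI is a made-up example, not a real PASS document location.
console.log(
  toNTriple(
    'https://example.org/pod/documents/passport#document',
    'https://schema.org/GovernmentPermit',
    DOC_TYPES['Passport']
  )
);
```

Note that the value written is still the application's own string ('Passport'); only the predicate comes from schema.org.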

tomdieli commented 1 year ago

Not high priority, but I'm a nub, so this seems like as good a place as any to start.

Going over the comments, it sounds like we should first take the current mappings defined in DOC_TYPES in src/utils/doc_types.js and, once we have replaced the literal values with IRIs (within reason), move them into RDF_PREDICATES in rdf_predicates.js in the same directory.

I only see two components using DOC_TYPES at this time:

src/components/Form/DocumentSelection.jsx
src/components/Documents/DocumentTableRow.jsx

I assume DOC_TYPES (or some other object) would be set in a new or existing file in src/helpers. Either way, the import statements in the above-mentioned component files would need to change, along with any other references if the object is renamed.

timbot1789 commented 1 year ago

@tomdieli Yes, your analysis is correct. And actually, just combining doc_types with RDF_PREDICATES would be a good first PR. That would make it far easier to update the IRIs once we find good ones.
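One possible shape for such a combined table, offered only as a sketch: each document type keeps its user-facing label from doc_types next to the IRI that would be written to the pod. The object layout, the `null` placeholder for types without an IRI, and the helpers `docTypeOptions` and `keysForIri` are all assumptions, not the project's actual structure.

```javascript
// Hypothetical combined table: label for the UI, IRI for the pod.
// null marks types for which no IRI has been chosen yet.
const DOC_TYPES = {
  BankStatement: { label: 'Bank Statement', iri: null },
  Passport: { label: 'Passport', iri: 'https://schema.org/GovernmentPermit' },
  DriversLicense: { label: "Driver's License", iri: 'https://schema.org/GovernmentPermit' },
  Other: { label: 'Other', iri: null }
};

// Options for a document type select field: key as value, label as text.
function docTypeOptions(docTypes) {
  return Object.entries(docTypes).map(([key, { label }]) => ({ value: key, label }));
}

// Lists every key that maps to the given IRI.
function keysForIri(docTypes, iri) {
  return Object.keys(docTypes).filter((key) => docTypes[key].iri === iri);
}

console.log(docTypeOptions(DOC_TYPES).map((option) => option.label));
console.log(keysForIri(DOC_TYPES, 'https://schema.org/GovernmentPermit'));
```

With this layout, updating an IRI later is a one-line change. It also exposes a limitation of the mappings found so far: Passport and DriversLicense share one IRI, so the document type cannot be recovered from the IRI alone.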

ogorman89 commented 1 year ago

@timbot1789 I'm having a hard time envisioning what that combined table of doc_types and RDF_PREDICATES should look like.

A couple of questions I've collected:

Is something like the Inrupt SCHEMA_INRUPT.ts what you have in mind for our schema definition?

If so, do we need to incorporate an additional key:value pair for the user-facing strings like those in doc_types, which are currently looped through to populate the document type field in the document upload form?

Ideally, wouldn't our completed schema replace the current reference to the Inrupt schema?

I want to confirm the order of precedence for schema definition as:

HMIS -> FOAF -> Schema.org
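If that order were adopted (the thread does not confirm it), term lookup could be sketched as a first-match search over the vocabularies in priority order. The term maps below are illustrative: the HMIS map is left empty because no HMIS term IRIs appear in this thread, and `VOCABULARIES` and `resolveTerm` are hypothetical names.

```javascript
// Vocabularies listed in the proposed order of precedence.
const VOCABULARIES = [
  { name: 'HMIS', terms: {} }, // no HMIS term IRIs are assumed here
  { name: 'FOAF', terms: { name: 'http://xmlns.com/foaf/0.1/name' } },
  {
    name: 'Schema.org',
    terms: {
      name: 'https://schema.org/name',
      GovernmentPermit: 'https://schema.org/GovernmentPermit'
    }
  }
];

// Returns the first vocabulary that defines the term, or null if none does.
function resolveTerm(term, vocabularies = VOCABULARIES) {
  for (const { name, terms } of vocabularies) {
    if (Object.prototype.hasOwnProperty.call(terms, term)) {
      return { vocabulary: name, iri: terms[term] };
    }
  }
  return null;
}

console.log(resolveTerm('name')); // FOAF wins over Schema.org
console.log(resolveTerm('GovernmentPermit')); // only Schema.org defines it
console.log(resolveTerm('Passport')); // null: no vocabulary defines it
```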

tomdieli commented 1 year ago

@timbot1789 I'm afraid I must throw in the towel on this one. There's too much I don't understand. I can create a PR that just moves the DOC_TYPES literals into RDF_PREDICATES and then defines DOC_TYPES in rdf_predicates as well, which is what I think you suggested in a previous post. This would get rid of the doc_types file but do little else. I'm not sure that has much value on its own.

timbot1789 commented 1 year ago

@tomdieli Yeah, that's fine. The values I'm looking for may just not exist. If that's the case, we may want to backlog this longer, or roll our own URNs. We'll come back to this later.