alps-io / imports

Collection of imported vocabularies from other sources (e.g. IANA, Schema.org, microformats, etc.) converted into ALPS documents.
https://alps-io.github.io/imports/

Descriptor's href is ambiguous #2

Open tkawa opened 10 years ago

tkawa commented 10 years ago

For example, take http://alps.io/schema.org/Person (https://github.com/alps-io/imports/blob/master/schema.org/output/Person.xml in this repo):

<descriptor id="Person" type="semantic" href="http://alps.io/schema.org/Thing">

http://alps.io/schema.org/Thing doesn't identify a single descriptor. That document contains many descriptors, such as http://alps.io/schema.org/Thing#Thing, http://alps.io/schema.org/Thing#description, http://alps.io/schema.org/Thing#image, etc., so the reference is ambiguous.

ALPS spec http://alps.io/spec/index.html#rfc.section.2.2.6 says,

When it appears as an attribute of 'descriptor' [prop-descriptor], 'href' points to another 'descriptor' either within the existing ALPS document or in another ALPS document.

I'm going to write an ALPS parser, so I'd like you to either fix the profiles or add a rule to the spec that resolves such an href to a single descriptor (for example, the first root descriptor; my code will assume this for now).
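A minimal Python sketch of the fallback described above: an href with a fragment resolves to the descriptor with that id, and an href without a fragment falls back to the first root descriptor. The function name `resolve_descriptor` and the inline `THING_DOC` document are hypothetical stand-ins for illustration, not the real registry content or any existing parser's API.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical stand-in for the document served at http://alps.io/schema.org/Thing
THING_DOC = """
<alps>
  <descriptor id="Thing" type="semantic">
    <descriptor href="#description" />
  </descriptor>
  <descriptor id="description" type="semantic" />
  <descriptor id="image" type="semantic" />
</alps>
"""


def resolve_descriptor(href, document_xml):
    """Return the id of the descriptor that href points to, or None."""
    root = ET.fromstring(document_xml)
    fragment = urldefrag(href).fragment
    if fragment:
        # Unambiguous case: the fragment names a descriptor id.
        for descriptor in root.iter("descriptor"):
            if descriptor.get("id") == fragment:
                return descriptor.get("id")
        return None
    # Assumed fallback (not in the spec): no fragment means the
    # first descriptor directly under the root element.
    first = root.find("descriptor")
    return first.get("id") if first is not None else None
```

With this sketch, `resolve_descriptor("http://alps.io/schema.org/Thing", THING_DOC)` falls back to the first root descriptor, while `resolve_descriptor("http://alps.io/schema.org/Thing#image", THING_DOC)` follows the fragment.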

PS Thank you very much for coming to RESTful Meetup the other day :smile:

dillonredding commented 3 years ago

I also noticed some documents have multiple URLs in the href attribute.

<descriptor id="AccountingService" type="semantic" href="http://alps.io/schema.org/FinancialService http://alps.io/schema.org/ProfessionalService">

I don't think the specification makes it clear whether this is valid:

['href'] contains a resolvable URL.

When it appears as an attribute of a 'descriptor', 'href' points to another 'descriptor' either within the existing ALPS document as a fragment or in another ALPS document as an absolute URL. The URL MUST contain a fragment per Section 2.2.7.2 referencing the related 'descriptor'.

I could be wrong on this, but I don't believe XML supports multiple, space-separated values for attributes out of the box. That'd be an application-specific extension/convention, would it not?

Furthermore, how would this look in JSON? Would href be an array or the same space-separated URL string?
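The two requirements in the quoted spec text (a single resolvable URL, and a fragment naming the related descriptor) can be checked mechanically. A rough Python sketch follows; `check_href` is a hypothetical helper written for this illustration, not part of any ALPS tooling.

```python
from urllib.parse import urldefrag


def check_href(href):
    """Return a list of problems found in a descriptor's href value."""
    problems = []
    parts = href.split()  # whitespace-separated candidates
    if len(parts) != 1:
        problems.append(f"expected exactly one URL, found {len(parts)}")
    for part in parts:
        # The quoted spec text says the URL MUST contain a fragment.
        if not urldefrag(part).fragment:
            problems.append(f"no fragment in {part}")
    return problems
```

Run against the AccountingService example above, this reports three problems: two URLs instead of one, and a missing fragment on each.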

mamund commented 3 years ago

right -- this is not valid to the current draft. at one point we thought this might work, but it doesn't.

there are some other possible ways to support something like this, but it depends on the intent of the author.

if there are multiple possible definitions/provenances for the same element, you could include two elements w/ the same name property:

<descriptor 
   id="AccountingServiceFinancial" 
   name="AccountingService"
   type="semantic" 
   href="http://alps.io/schema.org/FinancialService"
/>

and

<descriptor 
   id="AccountingServiceProfessional" 
   name="AccountingService"
   type="semantic" 
   href="http://alps.io/schema.org/ProfessionalService"
/>
mamund commented 3 years ago

these imports were created at a time when we wanted to explore how to include parts of other docs, how to compose ALPS vocabularies from existing content, etc.

i've come to the point where i don't like external references -- they add quite a burden to the resolver/reader. instead, i like self-contained documents and like to add references to the source of the element. A kind of "citation" to back up the vocabulary word you're using.

<descriptor 
   name="AccountingService"
   type="semantic" 
   src="http://alps.io/schema.org/FinancialService"
/>

this use of src (sometimes, i've used ref) is not in the spec right now and i treat it as an extension (meaning it is not controlled by the spec ATM).

but this whole idea needs work.

dillonredding commented 3 years ago

Is there a way to combine those two separate descriptors? Seems like the issue is that href assumes single inheritance, but schema.org supports multiple inheritance. The only way I can think to account for that would be to manually inherit from both of them (i.e., not take advantage of the "inheritance" described in section 2.2.3 of the spec)?

<alps>
  <descriptor id="AccountingService" type="semantic">
    <descriptor href="http://alps.io/schema.org/FinancialService.xml#..." />
    <!-- more FinancialService descriptors -->
    <descriptor href="http://alps.io/schema.org/ProfessionalService.xml#..." />
    <!-- more ProfessionalService descriptors -->
  </descriptor>
</alps>
mamund commented 3 years ago

you can def. combine them in various ways.

the way you express these relationships has a lot to do w/ how you plan to use them to solve real problems. the above expression works when i want to know how some vocabulary elements are related to each other. but i don't think it would be as helpful an expression to those who just want to rely on a single definition of the term. in fact, adding this level of detail might overly complicate my code if all i was interested in was that term to use for professional services.

dillonredding commented 3 years ago

@mamund, what do you think the best approach is for multiple parents in the registry?

mamund commented 3 years ago

not sure of the Q. can you show an example?

dillonredding commented 3 years ago

I suppose my question is: What's the fix for the multiple, space-separated URIs at /alps/descriptor/@href in terms of ALPS representation?

<alps>
  <descriptor id="PhysicalExam" href="http://alps.io/schema.org/MedicalEnumeration.xml#MedicalEnumeration http://alps.io/schema.org/MedicalProcedure.xml#MedicalProcedure">
    <!-- PhysicalExam properties -->
  </descriptor>
</alps>

If the above is not a correct way of communicating multiple parents, what is the "correct" way when it comes to the http://alps.io/schema.org registry? We've discussed two possibilities, but I'm not sure either is ideal for the official registry based on the conversation.

mamund commented 3 years ago

@dillonredding

been thinking about this more and wondering if we could fashion something using the EXT [1] element?

maybe something like

<alps>
  <descriptor id="PhysicalExam">
    <ext 
      id="physicalExamParents" 
      name="parents" 
      value="http://alps.io/schema.org/MedicalEnumeration.xml#MedicalEnumeration 
      http://alps.io/schema.org/MedicalProcedure.xml#MedicalProcedure" 
    />
    <!-- PhysicalExam properties -->
  </descriptor>
</alps>

lots of variations are possible.

open to ideas.

[1] https://tools.ietf.org/html/draft-amundsen-richardson-foster-alps-03#section-2.2.4

dillonredding commented 3 years ago

That could work, but I feel it's not utilizing the full power of ALPS and maybe limits those using the registry. Specifically, users lose this feature:

If 'descriptor' has an 'href' attribute, then 'descriptor' is inheriting all the attributes and sub-properties of the descriptor pointed to by 'href'.

Without this, all the possible properties in a PhysicalExam, for example, aren't described by the registry, at least not in a way that can utilize descriptor inheritance. By using ext, users are required to implement custom processing logic if they want the inheritance.

This makes me question the purpose of mapping the schema.org vocabulary to ALPS. Perhaps we should back up and discuss the intent of the central ALPS registry. Why does it exist? What problem is it solving?

You've mentioned apprehension toward inheritance, but I think the registry can describe the full hierarchy of schema.org without creating a lot of bloat for users, especially since there's a sort of built-in opt-in/opt-out inheritance feature. For example, if a user wants to utilize the full breadth of a schema, they easily can:

<alps>
  <descriptor id="MyPhysicalExam" href="http://alps.io/schema.org/PhysicalExam.xml#PhysicalExam" />
  <descriptor id="MyPostalAddress" href="http://alps.io/schema.org/PostalAddress.xml#PostalAddress" />
</alps>

If they don't, and I would say this is likely most cases, users can opt out by customizing while still using the registry:

<alps>
  <descriptor id="MyPhysicalExam">
    <!-- only need a subset of properties -->
    <descriptor id="procedureType" href="http://alps.io/schema.org/MedicalProcedure.xml#procedureType" />
    <descriptor id="status" href="http://alps.io/schema.org/MedicalProcedure.xml#status" />
  </descriptor>

  <descriptor id="MyPostalAddress" href="http://alps.io/schema.org/PostalAddress.xml#PostalAddress">
    <!-- need to customize a property -->
    <descriptor id="addressCountry" type="safe" href="http://alps.io/schema.org/PostalAddress.xml#addressCountry" rt="http://alps.io/schema.org/Country.xml#Country" />
  </descriptor>
</alps>
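A rough Python sketch of how a reader might apply the inheritance rule to the MyPostalAddress example: inherit the parent's attributes and sub-properties, then let local attributes and same-id children override them. The `REGISTRY` dictionary, the dict layout, and the `expand` function are assumptions made for illustration; a real resolver would fetch and parse the referenced ALPS documents, and this sketch follows only one level of href.

```python
# Hypothetical in-memory registry keyed by descriptor URL.
REGISTRY = {
    "http://alps.io/schema.org/PostalAddress.xml#PostalAddress": {
        "id": "PostalAddress",
        "type": "semantic",
        "children": {
            "addressCountry": {"id": "addressCountry", "type": "semantic"},
            "postalCode": {"id": "postalCode", "type": "semantic"},
        },
    },
}


def expand(descriptor, registry):
    """Merge the descriptor referenced by href with local overrides.

    Only one level of href is followed; the parent's own href is ignored.
    """
    parent = registry.get(descriptor.get("href"), {})
    # Start from the parent's attributes, then apply local ones on top.
    merged = {k: v for k, v in parent.items() if k != "children"}
    merged.update(
        {k: v for k, v in descriptor.items() if k not in ("children", "href")}
    )
    # Inherit all sub-properties; a local child with the same id wins.
    children = dict(parent.get("children", {}))
    children.update(descriptor.get("children", {}))
    merged["children"] = children
    return merged


# The customized descriptor from the example above, as a dict.
MY_POSTAL_ADDRESS = {
    "id": "MyPostalAddress",
    "href": "http://alps.io/schema.org/PostalAddress.xml#PostalAddress",
    "children": {
        "addressCountry": {
            "id": "addressCountry",
            "type": "safe",
            "rt": "http://alps.io/schema.org/Country.xml#Country",
        },
    },
}

EXPANDED = expand(MY_POSTAL_ADDRESS, REGISTRY)
```

In `EXPANDED`, postalCode is inherited unchanged while addressCountry carries the customized type and rt.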

That said, I may not understand the intent of the registry, so please feel free to correct me on any of this.

mamund commented 3 years ago

@dillonredding

OK, i've offered three possible ways to express multiple inheritance and none seem to be what you expect. at this point, i'm happy to see your suggestion and check on whether it causes any problems for others (i doubt it will). assuming you have a solid proposal, I'll be fine with that.

as for the "purpose" of all this, we wanted to create an approach that consistently expressed schema.org definitions within the ALPS document space. AFAICT, there are multiple ways to do this and what is needed is a consistent approach applied to all the entries. once we have that, i'll be satisfied.

so, let's focus on creating a solid approach and then making sure that's applied consistently throughout the collection.

show me what you have in mind and let's go from there.

dillonredding commented 3 years ago

I apologize if I'm overstepping here, but this would be my proposal:

  1. If a schema has exactly one parent, use the href attribute to refer to the parent.
    <alps>
     <descriptor id="PostalAddress" href="http://alps.io/schema.org/ContactPoint.xml#ContactPoint">
       <doc href="https://schema.org/PostalAddress" />
       <descriptor id="addressCountry" type="semantic">
         <doc href="https://schema.org/addressCountry" />
       </descriptor>
       <descriptor id="addressLocality" type="semantic">
         <doc href="https://schema.org/addressLocality" />
       </descriptor>
       <descriptor id="addressRegion" type="semantic">
         <doc href="https://schema.org/addressRegion" />
       </descriptor>
       <descriptor id="postOfficeBoxNumber" type="semantic">
         <doc href="https://schema.org/postOfficeBoxNumber" />
       </descriptor>
       <descriptor id="postalCode" type="semantic">
         <doc href="https://schema.org/postalCode" />
       </descriptor>
       <descriptor id="streetAddress" type="semantic">
         <doc href="https://schema.org/streetAddress" />
       </descriptor>
     </descriptor>
    </alps>
  2. If a schema has two or more parents, but only one ancestral path contains properties, just use the href attribute:
    <alps>
     <descriptor id="CreativeWorkSeries" href="http://alps.io/schema.org/CreativeWork.xml#CreativeWork">
       <doc href="https://schema.org/CreativeWorkSeries" />
       <descriptor id="endDate" type="semantic">
         <doc href="https://schema.org/endDate" />
       </descriptor>
       <descriptor id="issn" type="semantic">
         <doc href="https://schema.org/issn" />
       </descriptor>
       <descriptor id="startDate" type="semantic">
         <doc href="https://schema.org/startDate" />
       </descriptor>
     </descriptor>
    </alps>

    (Is there any value in specifying "empty" parents?)

  3. If a schema has multiple parents with properties in each path, denorm all ancestor properties:

    <alps>
     <descriptor id="Diet">
       <doc href="https://schema.org/Diet" />
       <descriptor id="dietFeatures" type="semantic">
         <doc href="https://schema.org/dietFeatures" />
       </descriptor>
       <descriptor id="endorsers" type="semantic">
         <doc href="https://schema.org/endorsers" />
       </descriptor>
       <descriptor id="expertConsiderations" type="semantic">
         <doc href="https://schema.org/expertConsiderations" />
       </descriptor>
       <descriptor id="physiologicalBenefits" type="semantic">
         <doc href="https://schema.org/physiologicalBenefits" />
       </descriptor>
       <descriptor id="risks" type="semantic">
         <doc href="https://schema.org/risks" />
       </descriptor>
    
       <!-- CreativeWork properties -->
       <descriptor id="about" href="http://alps.io/schema.org/CreativeWork.xml#about" />
       <!-- ... -->
    
       <!-- MedicalEntity properties -->
       <descriptor id="code" href="http://alps.io/schema.org/MedicalEntity.xml#code" />
       <!-- ... -->
    
       <!-- Thing properties -->
       <descriptor id="additionalType" href="http://alps.io/schema.org/Thing.xml#additionalType" />
       <!-- ... -->
     </descriptor>
    </alps>

I can see how 2 is overkill, and I would have no problem leaving that one out and just going with 1 and 3.

Alternatively, I could see going with 3 and leaving 1 and 2 out entirely. This would require reverting some of the changes in #4 (😬), and perhaps this is what you meant by consistency and what was hinted at by the original structure.
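The three cases in the proposal can be read as a small decision rule. The Python sketch below is one interpretation, not part of the proposal itself: `import_strategy` is a hypothetical name, the input format is invented for illustration, and situations the proposal does not mention (such as no parents at all) are reported as unspecified rather than guessed.

```python
def import_strategy(parents):
    """Pick how to express a schema's parents, following the three cases.

    parents maps each parent name to True when that ancestral path
    contributes at least one property (an assumed input format).
    """
    with_props = [name for name, has_props in parents.items() if has_props]
    if len(parents) == 1:
        # Case 1: exactly one parent, refer to it with href.
        return ("href", next(iter(parents)))
    if len(parents) >= 2 and len(with_props) == 1:
        # Case 2: several parents, only one path has properties.
        return ("href", with_props[0])
    if len(with_props) >= 2:
        # Case 3: properties on several paths, copy them all in.
        return ("denormalize", with_props)
    # Not covered by the proposal (e.g. no parents at all).
    return ("unspecified", None)
```

For example, `import_strategy({"ContactPoint": True})` selects href, while `import_strategy({"CreativeWork": True, "MedicalEntity": True})` selects denormalization, matching the PostalAddress and Diet examples.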

mamund commented 3 years ago

@dillonredding:

thanks for putting this together.

and no problem about "overstepping" or anything -- this is a group effort all around.

i think i see what you're thinking here and i'm good w/ it. as i think i mentioned early on, @leonardr and I were experimenting w/ the notion of imports from other semantic spaces and the schema.org rendition was a first pass after some initial thoughts.

your approach makes sense and is, AFAICT, well within the stated spec. i don't see any important stumblers here yet. i'd like to play with it a bit over the next day or two (takes me a while to ingest stuff like this, sorry).

assuming we all like what's here, what will be needed is an effort to make sure the imports are consistent throughout the set and to make sure the mods for schema.org don't clash/break approaches we tried w/ IANA and OpenSearch. IOW, we'll need to make a pass there, too, i think.

finally, we should run these all through the validator @filip26 has built (https://github.com/filip26/alps). I don't anticipate any problems but would like to roll his work into our "process" as soon as his CLI is stable.

thanks again for all this.

open for any other feedback from others, too!

cheers.

mamund commented 3 years ago

@tkawa

it hits me that, after all this time, we're finally listening to you and i haven't reached out to see what you think of the adjustments here. my regrets for this mistake.

please feel free to add your feedback and help clarify your original observations and help us make this collection better.

thanks.

dillonredding commented 3 years ago

@leonardr and I were experimenting w/ the notion of imports from other semantic spaces and the schema.org rendition was a first pass after some initial thoughts.

Just want to say I really appreciate all the work you guys have done with this, which is why I want to help. I'd like others to realize the potential I see with ALPS, as well as REST in general.

tkawa commented 3 years ago

My initial comment is about inheritance, and its purpose is to allow for shared understanding through the parent descriptor. For example, a client looking for a PhysicalExam could treat MyPhysicalExam as a PhysicalExam.

Therefore, case 3 of @dillonredding's proposal unfortunately doesn't achieve my goal; clients looking for a CreativeWork or a MedicalEntity can't handle a Diet.

However, it's difficult to successfully introduce multiple inheritance without deviating significantly from the existing specifications of ALPS. Also, there may not be many cases where we want to treat a Diet as a CreativeWork.

With that in mind, I think the proposal is pragmatic. I would add that in case 3, I think it would be better to choose one of the parents and put it in the href (similar to case 2).
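The "treat MyPhysicalExam as a PhysicalExam" check can be sketched as a walk along the chain of href links. In the Python sketch below, `PARENT_OF` is a hypothetical table of already-resolved single-parent links (the choice of MedicalProcedure as PhysicalExam's href is assumed for the example), and `is_a` is an illustrative helper, not an existing API.

```python
# Hypothetical href links: descriptor URL -> the single parent in its href.
PARENT_OF = {
    "#MyPhysicalExam": "http://alps.io/schema.org/PhysicalExam.xml#PhysicalExam",
    "http://alps.io/schema.org/PhysicalExam.xml#PhysicalExam":
        "http://alps.io/schema.org/MedicalProcedure.xml#MedicalProcedure",
}


def is_a(descriptor, ancestor, parent_of):
    """Return True if ancestor is reachable from descriptor via href links."""
    seen = set()  # guards against accidental href cycles
    current = descriptor
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False
```

With a single href per descriptor, a client looking for a MedicalProcedure finds MyPhysicalExam, but one looking for a MedicalEnumeration does not, which is the trade-off of putting only one of the parents in the href.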

It took me a while to sort out my thoughts. Thanks for the suggestion.

mamund commented 3 years ago

@dillonredding @tkawa

looks like we are talking about stylistic things here -- that we can achieve what we want w/o changes in the specification, just changes in how we use the specification -- is that correct?

as a follow up, it would be great to make this a talking point at an upcoming Open Office Hours session. not sure about @dillonredding , but i suspect our sessions might be pretty late for @tkawa .

maybe we should come up w/ another time to discuss this "live"? ideas?

dillonredding commented 3 years ago

I'm down for a live discussion. That's usually more productive than asynchronous conversation. I'm in CST.

mamund commented 3 years ago

@dillonredding @tkawa

can we do the discussion on ALPS and inheritance-handling during the scheduled ALPS Open Office Hours (14:30 UTC this thursday)?

if yes, you can use the link that I post in the email list each week and we can cover it then.

dillonredding commented 3 years ago

@mamund, what email list?

mamund commented 3 years ago

aha!

https://groups.google.com/g/alps-io

join up! some good convos there.
