google / schemarama

Schemarama is a project exploring standards-based validation for structured data, especially Schema.org.
Apache License 2.0

Current demo validates differently between shex and shacl (e.g. schema.org Event and Recipe samples) #42

Open danbri opened 2 years ago

danbri commented 2 years ago

Relaying from Aaron:

I pasted the first http://schema.org/Event example into the tool and it passed as error free with ShEx validation, but generated 4 errors with SHACL...

(I have verified this -- @danbri)

{
    "@context": "https://schema.org",
    "@type": "MusicGroup",
    "event": [
        {
            "@type": "Event",
            "location": "Memphis, TN, US",
            "offers": "ticketmaster.com/foofighters/may20-2011",
            "startDate": "2011-05-20",
            "url": "foo-fighters-may20-fedexforum"
        },
        {
            "@type": "Event",
            "location": "Council Bluffs, IA, US",
            "offers": "ticketmaster.com/foofighters/may23-2011",
            "startDate": "2011-05-23",
            "url": "foo-fighters-may23-midamericacenter"
        }
    ],
    "image": [
        "foofighters-1.jpg",
        "foofighters-2.jpg",
        "foofighters-3.jpg"
    ],
    "name": "Foo Fighters",
    "track": [
        {
            "@type": "MusicRecording",
            "audio": "foo-fighters-rope-play.html",
            "duration": "PT4M5S",
            "inAlbum": "foo-fighters-wasting-light.html",
            "interactionStatistic": {
                "@type": "InteractionCounter",
                "interactionType": "https://schema.org/ListenAction",
                "userInteractionCount": "14300"
            },
            "name": "Rope",
            "offers": "foo-fighters-rope-buy.html",
            "url": "foo-fighters-rope.html"
        },
        {
            "@type": "MusicRecording",
            "audio": "foo-fighters-everlong-play.html",
            "duration": "PT6M33S",
            "inAlbum": "foo-fighters-color-and-shape.html",
            "name": "Everlong",
            "interactionStatistic": {
                "@type": "InteractionCounter",
                "interactionType": "https://schema.org/ListenAction",
                "userInteractionCount": "11700"
            },
            "offers": "foo-fighters-everlong-buy.html",
            "url": "foo-fighters-everlong.html"
        }
    ],
    "subjectOf": {
        "@type": "VideoObject",
        "description": "Catch this exclusive interview with Dave Grohl and the Foo Fighters about their new album, Rope.",
        "duration": "PT1M33S",
        "name": "Interview with the Foo Fighters",
        "thumbnail": "foo-fighters-interview-thumb.jpg",
        "interactionStatistic": {
            "@type": "InteractionCounter",
            "interactionType": "https://schema.org/CommentAction",
            "userInteractionCount": "18"
        }
    }
}


danbri commented 1 year ago

@Gnomus042 any idea what is happening here?

danbri commented 1 year ago

@ericprud and I took a look. A similar issue affects the first sample (Recipe) too.

SHACL validation reports the value of the interactionType property (http://schema.org/Comment) as failing.

This fixes the example:

  "interactionStatistic": {
    "@type": "InteractionCounter",
    "interactionType": { "@type": "ConsumeAction" },
    "userInteractionCount": "140"
  }, // ...etc

By contrast, ShEx validation doesn't complain.
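To find every place in a sample that uses the same pattern, something like the following could be used. It is only a sketch: `bare_interaction_types` is a hypothetical helper, not part of schemarama, and it walks the JSON-LD as plain JSON without any context processing. The two inputs are trimmed-down versions of the VideoObject from the sample, before and after the fix.

```python
def bare_interaction_types(node, path="$"):
    """Yield (path, value) for each interactionType given as a plain string."""
    if isinstance(node, dict):
        value = node.get("interactionType")
        if isinstance(value, str):
            yield (path + ".interactionType", value)
        # Recurse into every child so nested counters are found too.
        for key, child in node.items():
            yield from bare_interaction_types(child, f"{path}.{key}")
    elif isinstance(node, list):
        for i, child in enumerate(node):
            yield from bare_interaction_types(child, f"{path}[{i}]")


# Value given as a string, as in the schema.org sample.
original = {
    "@type": "VideoObject",
    "interactionStatistic": {
        "@type": "InteractionCounter",
        "interactionType": "https://schema.org/CommentAction",
        "userInteractionCount": "18",
    },
}

# Value given as a typed node, following the fix shown above.
fixed = {
    "@type": "VideoObject",
    "interactionStatistic": {
        "@type": "InteractionCounter",
        "interactionType": {"@type": "CommentAction"},
        "userInteractionCount": "18",
    },
}

print(list(bare_interaction_types(original)))
print(list(bare_interaction_types(fixed)))
```

The first call reports the one string-valued interactionType; the second reports none, because the value is now a node carrying its own @type.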

The official definition of interactionType says that it takes Action values and is used on InteractionCounter. In the current example (test 1), the value parses as a URI encoded as a string (this depends on the schema.org context file and could change).

So the instance data looks like this: something with an interactionType whose value is a plain string.

_:b1 <http://schema.org/interactionType> "https://schema.org/Comment" .

Even if this were parsed as a URI, i.e.

_:b1 <http://schema.org/interactionType> <https://schema.org/Comment> .

...no type is declared, so the validation (being somewhat closed-world) should complain that we are not told this is some kind of Action.
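The difference between the two triples comes down to how the context defines the term. As a rough illustration (a simplified view of JSON-LD type coercion, not a JSON-LD processor), a term defined with "@type": "@id" turns string values into IRIs, while a term without it leaves them as literals. `object_term` and the term definition passed to it are hypothetical; nothing here asserts what the schema.org context actually contains.

```python
def object_term(value, term_definition=None):
    """Render a JSON-LD string value as an N-Triples object term.

    Simplification: only "@type": "@id" coercion is modelled, and
    literal escaping is ignored.
    """
    coerced_to_iri = bool(term_definition) and term_definition.get("@type") == "@id"
    return f"<{value}>" if coerced_to_iri else f'"{value}"'


value = "https://schema.org/Comment"

# No coercion declared for the term: the value stays a string literal.
as_literal = object_term(value)

# Hypothetical term definition that coerces values to IRIs.
as_iri = object_term(value, {"@id": "schema:interactionType", "@type": "@id"})

print(as_literal)
print(as_iri)
```

Either way the resulting node has no rdf:type triple of its own, which is the point made above.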

Why is ShEx not complaining?

This specific issue seems to trace to the ShEx-generation code that populates the validation/shex/raw_shapes directory.

The SHACL and ShEx files generated from Schema.org's own definitions are organized in a type-centric way, so we find the generated rules for interactionType by looking for the type(s) that the property lists under schema:domainIncludes.

In fact, this is also puzzling. The only type where interactionType is expected is Action, and yet neither SHACL nor ShEx complains, despite it being used on InteractionCounter, which isn't an Action subtype. This needs investigation.

Meanwhile, for the values of interactionType, SHACL complains and ShEx doesn't.

Hypothesis...

In Action.shex in shex/raw_shapes we have this:

<#ValidSchemaAction> @<#ValidSchemaThing> AND EXTRA a {
    a [schema:MarryAction schema:IgnoreAction schema:EatAction schema:ConsumeAction schema:AssignAction schema:UseAction schema:ApplyAction schema:EndorseAction schema:ShareAction schema:FilmAction schema:ControlAction schema:CheckAction schema:DisagreeAction schema:TakeAction schema:DonateAction schema:CreateAction schema:ScheduleAction schema:ReactAction schema:InteractAction schema:ArriveAction schema:DeactivateAction schema:ReserveAction schema:PlanAction schema:BookmarkAction schema:ReviewAction schema:AcceptAction schema:CancelAction schema:BorrowAction schema:MoveAction schema:PaintAction schema:AddAction schema:ChooseAction schema:DiscoverAction schema:ViewAction schema:ExerciseAction schema:DrawAction schema:OrganizeAction schema:ReplaceAction schema:LikeAction schema:VoteAction schema:AchieveAction schema:InviteAction schema:CheckOutAction schema:PerformAction schema:ActivateAction schema:DrinkAction schema:DownloadAction schema:FollowAction schema:RegisterAction schema:TieAction schema:TravelAction schema:UnRegisterAction schema:DislikeAction schema:FindAction schema:SellAction schema:AskAction schema:ReceiveAction schema:AllocateAction schema:AssessAction schema:RejectAction schema:RentAction schema:RsvpAction schema:PlayAction schema:LoseAction schema:ResumeAction schema:TrackAction schema:ConfirmAction schema:SubscribeAction schema:WinAction schema:InformAction schema:WatchAction schema:UpdateAction schema:LendAction schema:OrderAction schema:ReturnAction schema:SendAction schema:AgreeAction schema:TipAction schema:LeaveAction schema:PrependAction schema:GiveAction schema:TransferAction schema:ListenAction schema:QuoteAction schema:JoinAction schema:PhotographAction schema:DepartAction schema:CommentAction schema:PreOrderAction schema:SuspendAction schema:DeleteAction schema:WantAction schema:AppendAction schema:InsertAction schema:CommunicateAction schema:WriteAction schema:BefriendAction schema:ReadAction schema:CookAction schema:SearchAction 
schema:AuthorizeAction schema:WearAction schema:CheckInAction schema:PayAction schema:ReplyAction schema:TradeAction schema:BuyAction schema:InstallAction schema:Action] * ;

If we replace that final '*' with a '+' it may fix this for ShEx.

Summary

InteractionCounter.shex is satisfied when it shouldn't be, because Action.shex does not require a type: its 'a' constraint ends with '*' (zero or more) rather than '+' (one or more).
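The effect of the cardinality can be shown with a toy check. This is a sketch of the hypothesis only: it models the single 'a [...]' triple constraint with EXTRA a, not ShEx as a whole and not the referenced ValidSchemaThing shape. `ACTION_TYPES` is an abbreviated stand-in for the long value set in Action.shex, and the blank node names are made up.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "http://schema.org/"

# Abbreviated stand-in for the value set listed in Action.shex.
ACTION_TYPES = {
    SCHEMA + name
    for name in ("Action", "ConsumeAction", "CommentAction", "ListenAction")
}


def satisfies_type_constraint(triples, node, min_count):
    """Check 'a [ACTION_TYPES]' with a minimum cardinality and no maximum.

    Because of EXTRA a, rdf:type values outside the value set are
    ignored rather than treated as violations.
    """
    matching = [
        o for (s, p, o) in triples
        if s == node and p == RDF_TYPE and o in ACTION_TYPES
    ]
    return len(matching) >= min_count


triples = [
    ("_:counter1", SCHEMA + "interactionType", "_:untyped"),
    ("_:counter2", SCHEMA + "interactionType", "_:typed"),
    ("_:typed", RDF_TYPE, SCHEMA + "ConsumeAction"),
]

STAR, PLUS = 0, 1  # minimum cardinality of '*' and '+'

print(satisfies_type_constraint(triples, "_:untyped", STAR))  # True
print(satisfies_type_constraint(triples, "_:untyped", PLUS))  # False
print(satisfies_type_constraint(triples, "_:typed", PLUS))    # True
```

With '*', a node that has no rdf:type triple at all still satisfies the constraint, which matches the behaviour seen in the demo; with '+', only the typed node passes.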

Next steps

danbri commented 1 year ago

For the interactionType property, here is what schema.org release 14 says in the property definition:

(base) danbri-macbookpro4% grep interactionType schemaorg-all-https.nt  
<https://schema.org/interactionType> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .
<https://schema.org/interactionType> <http://www.w3.org/2000/01/rdf-schema#label> "interactionType" .
<https://schema.org/interactionType> <https://schema.org/domainIncludes> <https://schema.org/InteractionCounter> .
<https://schema.org/interactionType> <https://schema.org/rangeIncludes> <https://schema.org/Action> .
<https://schema.org/interactionType> <http://www.w3.org/2000/01/rdf-schema#comment> "The Action representing the type of interaction. For up votes, +1s, etc. use [[LikeAction]]. For down votes use [[DislikeAction]]. Otherwise, use the most specific Action." .
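The same lookup can be done programmatically rather than by eye. The sketch below assumes the N-Triples text is available as a string and uses a deliberately narrow regular expression that only matches triples whose object is an IRI, so label and comment lines are skipped. `property_domains_and_ranges` is a hypothetical helper, and `sample` repeats three of the lines from the grep output above.

```python
import re

# Matches only "<s> <p> <o> ." lines, i.e. triples with an IRI object.
NT_IRI_TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.$")

DOMAIN_INCLUDES = "https://schema.org/domainIncludes"
RANGE_INCLUDES = "https://schema.org/rangeIncludes"


def property_domains_and_ranges(nt_text, prop_iri):
    """Return (domains, ranges) declared for prop_iri in N-Triples text."""
    domains, ranges = [], []
    for line in nt_text.splitlines():
        match = NT_IRI_TRIPLE.match(line.strip())
        if not match:
            continue  # blank line, literal object, or anything else
        subject, predicate, obj = match.groups()
        if subject != prop_iri:
            continue
        if predicate == DOMAIN_INCLUDES:
            domains.append(obj)
        elif predicate == RANGE_INCLUDES:
            ranges.append(obj)
    return domains, ranges


sample = """\
<https://schema.org/interactionType> <http://www.w3.org/2000/01/rdf-schema#label> "interactionType" .
<https://schema.org/interactionType> <https://schema.org/domainIncludes> <https://schema.org/InteractionCounter> .
<https://schema.org/interactionType> <https://schema.org/rangeIncludes> <https://schema.org/Action> .
"""

domains, ranges = property_domains_and_ranges(
    sample, "https://schema.org/interactionType"
)
print(domains)
print(ranges)
```

For these lines the domain is InteractionCounter and the range is Action, in line with the grep output.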