WGBH / PBCore2.0

Public Broadcasting Metadata Dictionary Project
http://www.pbcore.org

Allow for @source declarations for both title and titleType #48

Open kvanmalssen opened 10 years ago

kvanmalssen commented 10 years ago

Currently, titleType is an attribute of the title element. As is, this doesn't allow implementors to declare the source of the titleType vocabulary using the @source attribute and/or potentially creates a conflict for the use of the @source attribute. As all attributes should refer to the value of the element, titleType needs to become its own element to support this requirement.

Recommend the following change:

//pbcoreTitle
//pbcoreTitle/title
//pbcoreTitle/titleType

/title and /titleType would use the sourceVersionStringType.

dmaccarn commented 10 years ago

I'm not sure I understand why you need a sourceVersionStringType for the title? Are you just trying to clear some confusion around the source referring to type v. title? Might it be less of a change to add an attribute group around titleType? e.g. titleTypeSourceVersionGroup?

awead commented 10 years ago

I think the confusion @kvanmalssen is trying to clear up is that source is supposed to refer to the element, which would be pbcoreTitle, but declaring an authority source for a title isn't as necessary as it is for a type of title. However, using source to inform titleType is technically wrong.

Defining a titleTypeSourceVersionGroup would be consistent with others such as segmentTypeSourceVersionGroup and descriptionTypeSourceVersionGroup, but having additional elements for title and titleType would work as well. If you choose to, you could then remove sourceVersionStringType from the title element altogether to make the implementation usage explicit... if I understand correctly :smile:

dmaccarn commented 10 years ago

Would we have better schema version compatibility if we used "titleTypeSourceVersionGroup" attributes, since most of them are optional, as opposed to making element changes?

kvanmalssen commented 10 years ago

Hi all,

Thanks for your comments. I thought we had come to an agreement on this one in our last schema committee call, but perhaps there are still kinks to iron out.

What I envision is to be able to do this:

<pbcoreTitle>
   <title source="PBS Learning Media">The Wheels of Time</title>
   <titleType source="PBCore Title Type Vocabulary">Episode</titleType>
</pbcoreTitle>

So that you can declare a source for both title and titleType. If @titleType is an attribute, you cannot do this, and it leads to confusion over the target of the attribute (which should be the value in the element).

I suggested sourceVersionGroup as the type for both of these elements because I can see that you might want to use the full set of attributes that this type provides. I wasn't envisioning a need for any additional attributes that would require a new type specific to titles or titleTypes.

awead commented 10 years ago

Makes sense to me. :+1:

kvanmalssen commented 10 years ago

@dmaccarn you make a good point that this is not agreed upon, merely flagged for review. Glad we are using GitHub to trace the discussion. We will raise this in the next schema team meeting.

dmaccarn commented 10 years ago

I don't remember us agreeing on a schema change. I thought it was only brought up as an example of things needing review.

What about the "startEndTimeGroup"? Is it still tied to the title? Then it wouldn't be tied to the titleType; does this cause confusion? Wouldn't the best way be to treat pbcoreTitle like we do pbcoreDescription and have all the attributes in pbcoreTitle as a group? And if a title version is needed, add that version group also. This should make it backward compatible, in that current 2.0 records would validate. No?

e.g.

<!-- titleStringType -->
   <xsd:complexType name="titleStringType">
      <xsd:annotation>
         <xsd:documentation>The titleType is a complex group of attributes that help define
            the title and type, as well as allowing for titles of segments and relevant
            times.</xsd:documentation>
      </xsd:annotation>
      <xsd:simpleContent>
         <xsd:extension base="xsd:string">
            <xsd:attributeGroup ref="sourceVersionGroup"/>
            <xsd:attributeGroup ref="titleTypeSourceVersionGroup"/>
            <xsd:attributeGroup ref="segmentTypeSourceVersionGroup"/>
            <xsd:attributeGroup ref="startEndTimeGroup"/>
            <xsd:attribute name="annotation" type="xsd:string"/>
         </xsd:extension>
      </xsd:simpleContent>
   </xsd:complexType>

<xsd:attributeGroup name="titleTypeSourceVersionGroup">
  <xsd:annotation>
    <xsd:documentation>This group is similar to sourceVersionGroup but title specific
            and with the addition of a titleType attribute.</xsd:documentation>
  </xsd:annotation>
  <xsd:attribute name="titleType" type="xsd:string"/>
  <xsd:attribute name="titleTypeSource" type="xsd:string"/>
  <xsd:attribute name="titleTypeRef" type="xsd:string"/>
  <xsd:attribute name="titleTypeVersion" type="xsd:string"/>
  <xsd:attribute name="titleTypeAnnotation" type="xsd:string"/>
</xsd:attributeGroup>

kvanmalssen commented 10 years ago

That suggestion could work and help maintain backwards compatibility, but we should discuss. It might be somewhat confusing and doesn't support the guideline that says all attributes should refer to the value of the element.
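
For reference, a record under the attribute-group suggestion might look roughly like the sketch below. It assumes pbcoreTitle keeps titleStringType as its type and picks up the titleTypeSourceVersionGroup attributes from the snippet above; the values are illustrative only. Note that source describes the element value while titleTypeSource describes another attribute, which is the departure from the guideline.

```xml
<!-- Hypothetical instance; attribute names follow the suggested
     titleTypeSourceVersionGroup, values are made up for illustration. -->
<pbcoreTitle source="WXYZ"
             titleType="Episode"
             titleTypeSource="PBCore title types">Motherboards</pbcoreTitle>
```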

dmaccarn commented 10 years ago

Note: adding "sourceVersionGroup" to title may complicate harmonization with EBUCore. EBUCore points to DC:title which only has a language attribute.

AllisonAnn commented 10 years ago

I like Dave's titleStringType suggestion, since I strongly think that whatever we do, Title and TitleType need to be strictly linked as a set, and be repeatable. I'm not sure that, if TitleType becomes an element on its own, it is possible to retain that link once the Title element is repeated.

I generally have assumed that the SourceRefVersionAnnotation attribute set applies to controlled vocab fields. In that regard, I don't see much purpose for such fields in relation to Title, and wouldn't use them there, because Title is generally text based/unique, and not emanating from a controlled vocabulary. I would, however, want to have an Annotation/Note attribute (and maybe Source) tied to the Title element, to make any additional notes about the title I think are useful (e.g. it is a published title, it was assigned by the cataloguer (name of person), it was found written on the box, etc.).

Would this work - keep titleType as an attribute of the Title element, but (change the rule) be clear in the best practice documentation that the SourceRefVersionAnnotation set of attributes refers to the titleType attribute, and maybe introduce a Note (and Source) attribute for the Title element ...

May Contain:

3 or less optional attributes, specific:
titleSource (may be empty)
titleNote (may be empty)
titleType (may be empty)

4 or less optional attributes, specific (linked to titleType [I would indent/tab this set, but the browser won't render it that way]):
source (text, may be empty)
ref (text, may be empty)
version (text, may be empty)
annotation (text, may be empty)

3 or less optional attributes, specific:
startTime (text, may be empty)
endTime (text, may be empty)
timeAnnotation (text, may be empty)
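
As a rough illustration only, a title using this attribute layout might look like the sketch below. The titleSource and titleNote attributes are the names proposed in the list above and are not part of the PBCore 2.0 schema; the values are invented. Here source would describe the titleType value rather than the title text.

```xml
<!-- Hypothetical instance: titleSource/titleNote describe the title text,
     while source describes the titleType attribute (by documented convention). -->
<pbcoreTitle titleSource="WXYZ"
             titleNote="Title found written on the box"
             titleType="Episode"
             source="PBCore title types">Motherboards</pbcoreTitle>
```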

kvanmalssen commented 10 years ago

I have to disagree that we only need the source attribute to describe controlled vocabularies; we also need it to describe authorities, or data that has come from other sources. This is very common. Here is a real life use case that I was working on, which is one of the reasons I am proposing this change:

PBS has a program authority list that the American Archive wanted to append to existing records for national television programming. The title supplied by the station would be retained (with a source of that station's name) and the title from the PBS titles list would be added (with a source of PBS Program Authorities). This could be represented as:

Title: Computers Source: WXYZ

Title: Mr. Fixit's Wide World of Computers Source: PBS Program Authorities

Now, let's say I want to add some episodic information and use title types to differentiate between series and episode titles. This wouldn't be possible in PBCore 2.0 because I need a source for both title and type.

So, what I would like to see in PBCore to support this use case is the following:

<pbcoreTitle>
   <title source="WXYZ">Computers</title>
   <titleType source="PBCore title types">Series</titleType>
</pbcoreTitle>
<pbcoreTitle>
   <title source="PBS Program Authorities">Mr. Fixit's Wide World of Computers</title>
   <titleType source="PBCore title types">Series</titleType>
</pbcoreTitle>
<pbcoreTitle>
   <title source="WXYZ">Motherboards</title>
   <titleType source="PBCore title types">Episode</titleType>
</pbcoreTitle>

The proposal is then to bind title and title type together in a container element, as is done for many other elements in PBCore, and allow sourceVersionGroup attributes for each.
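
In schema terms, this could be sketched roughly as follows. This is a minimal, standalone illustration and not agreed schema text: the type name pbcoreTitleType is hypothetical, sourceVersionGroup is reduced to the four attributes discussed in this thread, and making titleType optional is an assumption.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Simplified stand-in for the existing sourceVersionGroup -->
  <xsd:attributeGroup name="sourceVersionGroup">
    <xsd:attribute name="source" type="xsd:string"/>
    <xsd:attribute name="ref" type="xsd:string"/>
    <xsd:attribute name="version" type="xsd:string"/>
    <xsd:attribute name="annotation" type="xsd:string"/>
  </xsd:attributeGroup>

  <!-- A string value that carries the sourceVersionGroup attributes -->
  <xsd:complexType name="sourceVersionStringType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:string">
        <xsd:attributeGroup ref="sourceVersionGroup"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- Hypothetical container type: each pbcoreTitle holds one title and
       (assumed optional) one titleType, so the pair stays linked when
       pbcoreTitle is repeated in a record -->
  <xsd:complexType name="pbcoreTitleType">
    <xsd:sequence>
      <xsd:element name="title" type="sourceVersionStringType"/>
      <xsd:element name="titleType" type="sourceVersionStringType" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="pbcoreTitle" type="pbcoreTitleType"/>

</xsd:schema>
```

Because each title/titleType pair lives inside its own pbcoreTitle, every source attribute refers only to the value of the element it sits on.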

I hope this makes sense.

AllisonAnn commented 10 years ago

Hey - I think this example looks good. I still think though, that the SourceRefVersionAnnotation attribute set is overkill and potentially confusing for the Title element, and that a note attribute (annotation?) suffices. Although, I wouldn't be against a Source attribute too.

In the example where you've linked a value from an authority to the Title element - personally, I decided it was not good to link authority values at the title level. Instead, I linked them at the Subject level, mainly because many items can be linked to more than one program/series over time, and/or they may not be actual published titles (final/finished programs) which aired as part of a given series (think field and studio recordings that can be linked/used in whole or in part, in multiple programs, but are not the actual finished/published/aired programs in and of themselves, or think about photographs which might illustrate or relate to the series or program, but are not actually a part of the published program/series (final product)).

Therefore, I would not link a Series authority value at the Title element to these items, and instead, link the Series authority title(s) at the Subject (Related) area of my database. I enter Series titles in the Title field only for items that are actually published/aired as part of a given series. But, any item can be linked to one or more Series/Program authority value(s) as necessary in the related Subject area of the database.

I adhere to very strict rules about what goes in the Title field, based on AACR2 title type rules, but, not everyone will see it this way of course (unless we specify a best practice for this element).

My main concern however, is that Title and TitleType be strictly linked. It appears from your example that this can be done, which is good.