Closed VladimirAlexiev closed 8 years ago
There could be ambiguous dates, for example when the year value is '00'. We are not sure whether it should be 2000 or 1900. We will not be using xsd:date unless the museum cleans up the data.
00 sounds like empty value. I think you should ignore such values. If you won't convert dates to XSD, put them in P3 not in P82a.
Are you suggesting reordering the date to be in the format year/month/day? Or how do you make something an xsd:date?
As per #25 and #26 I will change the data to be the rdfs:label of E61, so the dates themselves will no longer be in P82a. Does this satisfy the issue, or do I also need to reformat the date somehow?
Treating dates as strings is going to cause problems. We lose the information implicit in them, and they become unusable for anything date-like.
I think I fixed this issue. For the 2 date columns, I set semantic type and specified the literal type as "xsd:date". When I look at the rdf, it looks similar to what you said:
"6/6/93"^^<http://www.w3.org/2001/XMLSchema#date>
But this is not a valid xsd:date; it should be "1993-06-06". In which property do you put these?
Also: never use E61, that's pointless, just use a literal.
@rhao sorry I didn't read your Sep 20 comment carefully. Yes, xsd:date demands a certain format, you cannot load "6/6/93"^^xsd:date into a proper repo that validates literals.
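To make the format requirement concrete, here is a minimal Python sketch of the kind of check a validating repository performs on the lexical form. It is a simplification and not any particular repository's implementation: it only accepts four-digit years and ignores the optional timezone suffix that xsd:date also allows. The function name `is_simple_xsd_date` is made up for this example.

```python
import re
from datetime import date

# Simplified lexical form: exactly yyyy-mm-dd, ASCII digits only.
# Assumption: no timezone suffix and no years outside 0001-9999.
_SIMPLE_XSD_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_simple_xsd_date(text: str) -> bool:
    """Return True if text looks like a valid yyyy-mm-dd xsd:date."""
    match = _SIMPLE_XSD_DATE.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        # date() rejects impossible calendar dates such as February 30.
        date(year, month, day)
    except ValueError:
        return False
    return True
```

With this check, "6/6/93" is rejected and "1993-06-06" is accepted.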
Okay, so the format is yyyy-mm-dd, correct? The problem I have is that the data only contains 2 digits of the year, so if I see something like "06-06-11", I don't know whether that should become "1911-06-06" or "2011-06-06". Because it seems important to have the dates in xsd format, I made the valid years 1917-2016: if the year is 16, I make it 2016, but if it is 17, I make it 1917. After transforming the date, I set the literal type to xsd:date. This will cause me to misinterpret some data (for example, something created in 1916 will be marked as 2016), so the best solution is for the museum to re-export with the full date data. For now I have converted the dates this way so that at least they are modeled correctly.
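The 1917-2016 window described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the actual transformation used in the mapping: the function name and the `last_year` parameter are made up here, the input is assumed to be month, day, two-digit year, and both "/" and "-" are accepted as separators.

```python
import re
from datetime import date


def two_digit_year_to_xsd_date(text: str, last_year: int = 2016) -> str:
    """Convert 'm/d/yy' (or 'm-d-yy') to 'yyyy-mm-dd'.

    The two-digit year is placed in the 100-year window that ends at
    last_year, so with last_year=2016 the valid years are 1917-2016.
    """
    # Assumption: month comes first, then day, then the two-digit year.
    month, day, yy = (int(part) for part in re.split(r"[/-]", text.strip()))
    if not 0 <= yy <= 99:
        raise ValueError(f"expected a two-digit year in {text!r}")
    year = last_year - last_year % 100 + yy
    if year > last_year:
        year -= 100
    # date() raises ValueError for impossible dates; isoformat() gives yyyy-mm-dd.
    return date(year, month, day).isoformat()
```

For example, "6/6/93" becomes "1993-06-06" and "06-06-11" becomes "2011-06-06". The windowing is still a guess: an object really created in 1916 would come out as 2016, so a full re-export from the museum remains the better fix.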
Most dates are formatted like "month/day/yy". Some of the BeginISODates are formatted "yyyy-mm" or "yyyy-dd" - I can't tell which because the only ones in the dataset could be valid months or days. I assume it's month. So in this case, I will make it "yyyy-mm-01" since P82 is "at some point within", and making it begin on the 1st of the month is the most inclusive. For these rows, the EndISODate is NULL. I will make this empty string "", unless someone has a better suggestion.
"I assume it's month" is correct: yyyy-dd is an unlikely interpretation. Do not fake-complete to "yyyy-mm-01", instead emit it as "yyyy-mm"^^xsd:gYearMonth. Do not emit empty string. If you don't have an end date, the best is to emit P82, instead of P82a/b.
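A rough Python sketch of that advice, for illustration only: pick the datatype from the shape of the value, emit nothing for a missing value, and fall back to P82 when there is no end date. The function names are hypothetical, the property names are the short labels used in this thread, and treating None, "" and the string "NULL" as "missing" is an assumption about how the export represents empty cells.

```python
import re
from datetime import date

# Assumption: these are the ways a missing value can show up in the export.
MISSING = {None, "", "NULL"}


def typed_literal(value: str) -> tuple:
    """Return (lexical form, datatype) without padding partial dates."""
    if re.fullmatch(r"[0-9]{4}-(0[1-9]|1[0-2])", value):
        return value, "xsd:gYearMonth"
    match = re.fullmatch(r"([0-9]{4})-([0-9]{2})-([0-9]{2})", value)
    if match:
        date(*map(int, match.groups()))  # raises ValueError for impossible dates
        return value, "xsd:date"
    raise ValueError(f"unsupported date value: {value!r}")


def timespan_statements(begin, end) -> list:
    """Return (property, lexical form, datatype) tuples for one time-span."""
    if begin in MISSING:
        return []  # nothing to say; never emit an empty-string literal
    if end in MISSING:
        return [("P82", *typed_literal(begin))]
    return [("P82a", *typed_literal(begin)), ("P82b", *typed_literal(end))]
```

For a row with BeginISODate "1971-06" and a NULL EndISODate this yields a single P82 statement typed as xsd:gYearMonth, while a row with two full dates yields a P82a/P82b pair typed as xsd:date.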
Dates must be valid XSD format. So eg
should become
"1971-06-15"^^xsd:date