mbjones closed 6 years ago
Original Redmine Comment Author Name: Saurabh Garg (Saurabh Garg) Original Date: 2005-12-08T18:14:37Z
Moving to 1.7, as there is not enough time to get this done for the 1.6 deadline.
Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2011-03-17T05:55:37Z
Added nodedatadate column to the following tables: xml_nodes, xml_nodes_revisions, xml_path_index. Included DB update statements for postgres and oracle. Included Java upgrade utility (and mechanism for doing this in future releases) that converts date and dateTime nodedata values and inserts them into the nodedatadate column. The formats supported for insert/searching are two common ISO 8601 forms: yyyy-MM-dd and yyyy-MM-ddThh:mm:ss
Note that yyyy is supported as a numerical comparison already.
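For illustration, here is a minimal sketch of how those two formats could be recognized. It uses java.time, which may not be what the upgrade utility uses, and the class and method names (NodeDataDateParser, parse) are hypothetical. Note that java.time's ISO parser is slightly more lenient than the two patterns listed above (it also accepts a missing seconds field or fractional seconds).

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper, not Metacat's actual upgrade code.
final class NodeDataDateParser {

    // Returns the parsed value when the text looks like yyyy-MM-ddThh:mm:ss
    // or yyyy-MM-dd; returns empty otherwise so the caller can keep the
    // value as plain string (or numeric) data.
    static Optional<LocalDateTime> parse(String nodeData) {
        if (nodeData == null) {
            return Optional.empty();
        }
        String text = nodeData.trim();
        try {
            // Date-time form, e.g. 2011-03-17T05:55:37
            return Optional.of(LocalDateTime.parse(text));
        } catch (DateTimeParseException ignored) {
            // Not a date-time; try the date-only form next.
        }
        try {
            // Date-only form, e.g. 2005-12-08, treated as midnight of that day
            return Optional.of(LocalDate.parse(text).atStartOfDay());
        } catch (DateTimeParseException ignored) {
            return Optional.empty();
        }
    }
}
```

A bare year such as "2005" is deliberately not matched here, which is consistent with yyyy being handled by numeric comparison.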
Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2011-03-23T23:50:36Z
I'd like to expand the date formats we can accept to include all valid xs:dateTime formats (ISO 8601). I found I could use javax.xml.bind.DatatypeConverter.parseDateTime(String) to do all the heavy lifting - the only problem is that it will parse some pretty far-out dates and consider them valid rather than throwing an exception. For example, this random numeric value: "1297467700008" is parsed as: "196973390-06-18 09:34:08.384000-0800 BC". 200 million years ago, indeed.
When I try to insert this date in Postgres (and likely other DBs) it complains about being out of range:
java.sql.SQLException: ERROR: timestamp out of range: "196973390-06-18 09:34:08.384000-0800 BC"
And it is, indeed, out of the range that postgres accommodates.
I am hesitant to add much date-range checking to the insert code for fear that it will be inappropriately draconian and/or a performance hit. But if I were to check the date, what range would be appropriate? I'd say it should be the intersection of postgres' and oracle's capabilities, even though this can change from version to version and can be changed depending on the storage mechanism you configure if you're an advanced DB admin. Still, it's a starting place.
Alternatively, we could attempt numeric parsing first, in which case this non-delimited ISO format would be considered simple numeric data rather than a date. We'd still get range-based query operations, but they would be numeric rather than date-based.
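If a range check were added, it could be as small as the sketch below. The bounds (years 1 through 9999 AD) are deliberately conservative placeholder values, not a verified intersection of what postgres and oracle accept, and the class and method names are hypothetical. The method takes a java.util.Calendar because that is what DatatypeConverter.parseDateTime(String) returns.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical helper; the bounds are assumptions, not documented DB limits.
final class DateRangeCheck {

    static final int MIN_YEAR = 1;     // assumed lower bound (AD)
    static final int MAX_YEAR = 9999;  // assumed upper bound (AD)

    // True when the parsed date falls inside the assumed storable range.
    // Anything in the BC era is rejected outright.
    static boolean isStorable(Calendar parsed) {
        if (parsed.get(Calendar.ERA) != GregorianCalendar.AD) {
            return false;
        }
        int year = parsed.get(Calendar.YEAR);
        return year >= MIN_YEAR && year <= MAX_YEAR;
    }
}
```

The check is two field reads and two integer comparisons per value, so its cost should be small next to a failed database round trip, though that would need measuring.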
Original Redmine Comment Author Name: Matt Jones (Matt Jones) Original Date: 2011-03-24T03:00:59Z
I agree that DatatypeConverter.parseDateTime(String) is a great way to go, and supporting as much of ISO 8601 as possible is best -- that's what we claim for EML, so it would match. If the DB can't handle the dates, it will throw an exception as you indicate and we can fall back to treating that particular value as a string (treating it as a number is probably not that useful). It seems to me that you have a good approach worked out.
Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2011-03-24T16:25:16Z
My concern is that catching the exception at the database level will be a performance hit. But here's the plan:
So in the case that I previously laid out we would have three prepared statements executing for this one xml node (insert, a failed update, and a successful update). But in most cases we would have only the insert, because the ISO spec is still pretty constrained and numeric parsing is quite strict. We'd still be catching Java ParseExceptions, but hopefully only a small number of SQLExceptions, which are much more costly.
*I want to insert the node value as only a string initially so that the SQLException does not prevent the row from being written if/when there is a DB exception raised for a parsed date value.
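One possible reading of that flow, as a rough sketch only: insert the string first, then attempt a date update, then fall back to a numeric update if the database rejects the date. The ordering of the two updates is an assumption, as are all names here (NodeValueWriter, NodeStore and its methods); NodeStore stands in for the real prepared statements, and Double.parseDouble is looser than the strict numeric parsing mentioned above.

```java
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch, not the code that went into trunk.
final class NodeValueWriter {

    // Stand-in for the prepared statements (assumed, not Metacat's API).
    interface NodeStore {
        void insertString(long nodeId, String value) throws SQLException;
        void updateDate(long nodeId, Timestamp value) throws SQLException;
        void updateNumeric(long nodeId, double value) throws SQLException;
    }

    // Writes one node value and returns how many statements were executed.
    static int write(NodeStore store,
                     Function<String, Optional<Timestamp>> dateParser,
                     long nodeId, String value) throws SQLException {
        int statements = 0;

        // 1. Always write the row as a string first, so a later failure
        //    cannot prevent the row from existing.
        store.insertString(nodeId, value);
        statements++;

        // 2. If the value parses as a date, try to store it as one.
        Optional<Timestamp> date = dateParser.apply(value);
        if (date.isPresent()) {
            statements++;
            try {
                store.updateDate(nodeId, date.get());
                return statements;
            } catch (SQLException outOfRange) {
                // e.g. "timestamp out of range"; fall through to numeric
            }
        }

        // 3. Otherwise (or after a rejected date) try a numeric update.
        try {
            double number = Double.parseDouble(value.trim());
            store.updateNumeric(nodeId, number);
            statements++;
        } catch (NumberFormatException notNumeric) {
            // Leave the value as the string inserted in step 1.
        }
        return statements;
    }
}
```

With a value like "1297467700008" and a database that rejects the parsed date, this executes three statements (insert, failed date update, numeric update); for ordinary text it executes only the insert.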
Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2011-04-06T18:39:27Z
This is now in trunk and includes a utility method that runs during DB upgrade.
Running it on dev, I ran into OutOfMemory errors while upgrading. We need to keep an eye on this and on how the servers are configured WRT memory allocation.
Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2011-10-26T23:30:12Z
This should be in 2.0.0 since it is already in the trunk. The upgrade script is the only thing I am worried about -- but chunking up the modifications should resolve it.
Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2011-10-27T20:01:26Z
Will exercise this during testing for release.
Original Redmine Comment Author Name: gastil gastil (gastil gastil) Original Date: 2012-06-06T19:50:02Z
Invalid dates are allowed into Metacat 2.0.0 if the eml is 2.0.1. I thought those were supposed to be caught.
If you want to test this, an example doc to use is http://metacat.lternet.edu/knb/metacat/knb-lter-fce.515/lter which is un-patched eml 2.0.1 or http://lava.lternet.edu/knb/metacat/knb-lter-bug.515.1/lter which is patched eml 2.0.1 or https://demo2.test.dataone.org/knb/metacat/knb-lter-bug.515.1/default
This eml doc has
Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2012-09-05T16:52:34Z
Unfortunately, those values are "schema valid"
(In reply to comment #9)
Invalid dates are allowed into Metacat 2.0.0 if the eml is 2.0.1. I thought those were supposed to be caught.
If you want to test this, an example doc to use is http://metacat.lternet.edu/knb/metacat/knb-lter-fce.515/lter which is un-patched eml 2.0.1 or http://lava.lternet.edu/knb/metacat/knb-lter-bug.515.1/lter which is patched eml 2.0.1 or https://demo2.test.dataone.org/knb/metacat/knb-lter-bug.515.1/default
This eml doc has
25569 36891
Original Redmine Comment Author Name: Redmine Admin (Redmine Admin) Original Date: 2013-03-27T21:19:05Z
Original Bugzilla ID was 2084
Author Name: Duane Costa (Duane Costa) Original Redmine Issue: 2084, https://projects.ecoinformatics.org/ecoinfo/issues/2084 Original Date: 2005-05-20 Original Assignee: ben leinfelder
Metacat pathquery relational search modes ("greater-than", "less-than", etc.) do not currently support temporal searches on date fields. The reasons for this are described in the email correspondence to metacat-dev below. This enhancement would make it possible to do temporal searches using date ranges, which would be an important feature in an "Advanced Search" form (such as the one currently under development at LTER), and could also be added to the search dialog in Morpho.
On 5/17/2005, Duane Costa wrote:
Metacat supports the following pathquery search modes: contains, starts-with, ends-with, equals, isnot-equal, greater-than, less-than, greater-than-equals, less-than-equals.
For the search modes that are equivalent to relational operators (equals, isnot-equal, greater-than, less-than, greater-than-equals, less-than-equals), is it possible to use these search modes in EML fields that contain non-numeric string values? In particular, is it possible to use the relational search modes for date strings?
For example, here is a pathquery that attempts to find all documents with temporal coverage between January 1, 1900 and January 1, 2005. It reads like this: “Return all documents that have a beginDate or a singleDateTime greater than or equal to 1900-01-01, and an endDate or a singleDateTime less than or equal to 2005-01-01.”