samtools / htsjdk

A Java API for high-throughput sequencing data (HTS) formats.
http://samtools.github.io/htsjdk/
283 stars 242 forks source link

Allow space to separate date from time in Iso8601Date #1217

Open dariober opened 5 years ago

dariober commented 5 years ago

I wanted to point out that the parsing of date-time strings by samtools.util.Iso8601Date may be overly stringent since Iso8601Date doesn't allow a space as a separator between date and time. I think the space separator is quite common so it may be useful to support it?

Steps to reproduce

@Test
public void testTseparator(){
    new Iso8601Date("2000-01-01T12:00:00Z");
} // OK

@Test
public void testSpaceSeparator(){
    new Iso8601Date("2000-01-01 12:00:00Z");
}
htsjdk.samtools.util.DateParser$InvalidDateException: [For input string: "01 12"] is not an integer
    at htsjdk.samtools.util.DateParser.getCalendar(DateParser.java:230)
    at htsjdk.samtools.util.DateParser.parse(DateParser.java:245)
...

To support the space separator one could edit the DateParser class in samtools/util/DateParser.java from:

StringTokenizer st = new StringTokenizer(isodate, "-T:.+Z", true);

to:

StringTokenizer st = new StringTokenizer(isodate.trim().replace(" ", "T"), "-T :.+Z", true);

(I can send a pull request with tests if useful)

Environment

lbergelson commented 5 years ago

@dariober Thanks for the details. I'm honestly not super familiar with the Iso8601Date class or specification. ( I've been hoping we could replace this whole set of classes with more modern ones, but it's never been a high priority. https://github.com/samtools/htsjdk/issues/561.)

Is a space allowed in the spec? From a quick glance at wikipedia it looks like T is the valid separator.

If you're running into this as a practical problem in the wild I think we could accept a PR that allows this to accept slightly non-conforming dates and times.