Open mohd-akram opened 3 months ago
Does this need discussion on Discourse, or is the issue minor enough?
I think that it's a good idea to support formatting in the basic format (that I just discovered).
cc @pganssle @abalkin
Should it be basic with default value False or extended with default value True?
datetime.isoformat() has a parameter sep which specifies the separator between date and time. Taking it as a precedent, we can add similar parameters for separators between components in a date and a time. sep currently can only be a single character; it should also support an empty string.
On the other hand, adding parameters to .isoformat() is not the only way to solve this problem. You can also use .strftime() or str.replace().
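For reference, here is a minimal sketch of the two workarounds mentioned above. The sample datetime and variable names are only illustrative. Note that the str.replace() approach is only safe for naive datetimes, since it also strips the sign of a negative UTC offset.

```python
from datetime import datetime, timedelta, timezone

dt = datetime(2024, 6, 1, 16, 15, 0)

# Workaround 1: spell out the basic format with strftime directives.
via_strftime = dt.strftime("%Y%m%dT%H%M%S")

# Workaround 2: strip the separators from the extended format.
via_replace = dt.isoformat(timespec="seconds").replace("-", "").replace(":", "")

# Caveat: with a negative UTC offset, replace("-", "") also removes the sign.
aware = dt.replace(tzinfo=timezone(timedelta(hours=-5)))
mangled = aware.isoformat().replace("-", "").replace(":", "")

print(via_strftime)  # 20240601T161500
print(via_replace)   # 20240601T161500
print(mangled)       # 20240601T1615000500 (offset sign lost)
```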
I am maybe -0.5 on this feature. There is a case for putting stuff in isoformat if people are usually going to want automatic truncation, but in the use cases put forward, like filenames, you would almost certainly prefer a fixed format, so strftime(..., "%Y%M%DT%h%m%s.ext") seems like it would actually be better than this.
Taking it as a precedent, we can add similar parameters for separators between components in a date and a time. sep currently can only be a single character; it should also support an empty string.
We should definitely not do this, because ISO 8601 makes no provision for arbitrary separators, and to the extent that sep is even allowed to be something other than T, I'm fairly confident that you are not allowed to omit it entirely.
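To illustrate the current behavior of sep being discussed: as far as I can tell, isoformat() accepts any single character but rejects an empty string with a TypeError. This is a quick check, not a statement about what the standard allows, and the exact error message may vary between Python versions.

```python
from datetime import datetime

dt = datetime(2024, 6, 1, 16, 15)

# A single-character separator other than "T" is accepted.
with_space = dt.isoformat(sep=" ")
print(with_space)  # 2024-06-01 16:15:00

# An empty separator is expected to be rejected (assumption: TypeError).
try:
    dt.isoformat(sep="")
    empty_sep_rejected = False
except TypeError:
    empty_sep_rejected = True
print(empty_sep_rejected)
```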
First of all, note that my comments are based on ISO 8601:2004, which is superseded by 8601:2019, which I would need to buy (but I won't). I nevertheless assume that the informative parts remain the same (namely sections 1 and 2).
Should it be basic with default value False or extended with default value True?
ISO 8601:2004 section 2.3.3 says "The basic format should be avoided in plain text". For years, isoformat() has assumed the extended format, so a flag that explicitly enables the basic format is preferable (basic=True disables the extended format and explicitly switches to the basic format). With extended=False, we would implicitly switch to the basic format by disabling the extended one.
so strftime(..., "%Y%M%DT%h%m%s.ext") seems like it would actually be better than this.
In this case, I would agree, but this is not exactly the same as having the basic format as specified by ISO 8601. Now, while I did suggest a PR for the basic format (and would be happy if it were accepted), I'm actually wondering whether it is really needed in the end. For instance, the date command does not offer to output the basic format by default but accepts it as input, so it could also make sense for us not to offer it either (you can still output a basic format but you need to build it yourself, e.g., date +'%H%M%S').
@pganssle:
There is a case for putting stuff in isoformat if people are usually going to want automatic truncation
What do you mean by automatic truncation? The idea is to add an opt-in format basic=True; by default nothing is changed. Did I miss something?
What do you mean by automatic truncation?
When timespec is set to auto (the default), if a datetime doesn't have sub-second components, they will be excluded from the output; this, and the difference in how time zones are handled, are some of the main reasons why isoformat isn't just syntactic sugar for some strftime format:
>>> from datetime import datetime, timedelta, timezone
>>> dts = [datetime(2024, 3, 7, 12, 15, 30, 123456),
...        datetime(2024, 4, 9, 13),
...        datetime(2024, 5, 1, 16, 30, 2, 456123, tzinfo=timezone(timedelta(hours=5))),
...        datetime(2024, 6, 1, 16, 15, tzinfo=timezone(timedelta(hours=5, minutes=3, seconds=14)))]
>>> for dt in dts:
...     print(dt.isoformat())
...
2024-03-07T12:15:30.123456
2024-04-09T13:00:00
2024-05-01T16:30:02.456123+05:00
2024-06-01T16:15:00+05:03:14
>>> for dt in dts:
...     print(dt.strftime("%Y-%m-%dT%H:%M:%S.%f%z"))
...
2024-03-07T12:15:30.123456
2024-04-09T13:00:00.000000
2024-05-01T16:30:02.456123+0500
2024-06-01T16:15:00.000000+050314
The main reasons to use .isoformat are that you want this sort of truncation to happen, or that you prefer the simplicity of "just give me a datetime that complies with this standard". The more we complicate isoformat, the more it basically becomes strftime, and it gets bogged down in complexity.
I don't think we should automatically say isoformat should never change or grow new options, but the reasoning here is not particularly compelling, because it's suggesting an opt-in format with a name that most people won't understand, where the primary motivating use case not only can be replaced by an strftime call, but arguably should be replaced by an strftime call because:
.fromisoformat, but only the strftime version can be parsed by strptime ("oops, this datetime happened to have 0 for the microsecond component and now I need a different parse format!")
strftime, whereas they may not know what isoformat(basic=True) does, or what corner cases apply.
I suppose you could use dt.isoformat(timespec='seconds', basic=True) to alleviate concerns 1 and 3, but that still leaves concern 2.
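A small sketch of the "oops" case quoted above, using the existing extended format (basic=True does not exist in the stdlib). The format string FMT and the sample values are my own choices for illustration. With the default timespec, a datetime whose microsecond is 0 produces a string that a strptime format containing %f cannot parse; pinning timespec makes the output fixed-width again.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f"  # illustrative parse format that expects a fraction

with_us = datetime(2024, 3, 7, 12, 15, 30, 123456)
without_us = datetime(2024, 4, 9, 13)

# Round-trips fine when a fractional part is present.
roundtrip_ok = datetime.strptime(with_us.isoformat(), FMT) == with_us

# The default timespec drops the fraction, so the same format no longer matches.
try:
    datetime.strptime(without_us.isoformat(), FMT)
    auto_parse_failed = False
except ValueError:
    auto_parse_failed = True

# Pinning timespec keeps the output shape fixed, so the format matches again.
fixed = without_us.isoformat(timespec="microseconds")
fixed_ok = datetime.strptime(fixed, FMT) == without_us
```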
How about dt.isoformat(timespec='seconds', short=True)? That's use case oriented.
short: 20240601T161500.000000
long: 2024-06-01T16:15:00.000000
The term basic is the term used in the ISO standard and should be left as is IMO (if we were to support it).
I agree with @picnixz, the name here is not the problem. If the survey on the API for outputting Z is any guide, it is really hard to do something unambiguous. basic=True is almost certainly the best you can do, because it is the standard term for it so it is probably unambiguous, and worst case scenario you can google that term.
That said, almost everyone will have to google that term. I have read ISO 8601 several times, and I implemented two mostly full-featured ISO 8601 parsers, and I had to look up the term to see if it was an official term. No one is going to know what short=True does without looking it up or reading the docs. basic is definitely the best term for this, and it will undoubtedly create cognitive load relative to an explicitly specified format.
I think the main blocker here is that there's no compelling use case (and there actually kind of is a compelling use case for #90772, and we still didn't do that one because we couldn't come up with a non-confusing UX for it).
The motivation for the ISO basic format is the same as the extended format: it is a standardized machine-readable format that ensures seamless interoperability. You do not get that with many potentially subtly incorrect strftime/strptime implementations, as doing it right requires reading and implementing the spec correctly. That machinery is already implemented in Python, and you can also specify your desired granularity with timespec. Doing this manually would require creating strftime/strptime pairs for each case.
it will undoubtedly create cognitive load relative to an explicitly specified format.
IMO, unless one has the specification table memorized, I don't think "ISO but without - and :" would be more of a cognitive load than figuring out what %Y%M%DT%h%m%s.ext (which is subtly wrong) does.
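To make the "subtly wrong" point concrete, here is a small check using only portable directives: %M is the minute, not the month, so the first two fields of that format already give year plus minute. (Several of the other directives in it, such as %D, %h and %s, are platform-specific, so I am not showing their output.) The sample datetime is arbitrary.

```python
from datetime import datetime

dt = datetime(2024, 6, 1, 16, 15, 0)

# %M is the minute, so this yields year + minute rather than year + month.
year_then_minute = dt.strftime("%Y%M")

# The directives that actually produce the ISO 8601 basic layout.
basic_layout = dt.strftime("%Y%m%dT%H%M%S")

print(year_then_minute)  # 202415
print(basic_layout)      # 20240601T161500
```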
Should it be basic with default value False or extended with default value True?
Or format: datetime.Format = datetime.Format.extended?
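A rough sketch of what an enum-based spelling could look like from the caller's side. Both the Format enum and the format_iso helper are hypothetical names made up for illustration; nothing like datetime.Format exists in the stdlib today. The helper just post-processes isoformat() output, and assumes the default "T" separator.

```python
import enum
from datetime import datetime, timedelta, timezone


class Format(enum.Enum):
    """Hypothetical stand-in for the proposed datetime.Format."""
    extended = "extended"
    basic = "basic"


def format_iso(dt, fmt=Format.extended, timespec="auto"):
    """Hypothetical helper: ISO 8601 output in extended or basic format."""
    text = dt.isoformat(timespec=timespec)
    if fmt is Format.extended:
        return text
    # Split on "T" so that a "-" offset sign in the time part is preserved.
    date_part, time_part = text.split("T")
    return date_part.replace("-", "") + "T" + time_part.replace(":", "")


dt = datetime(2024, 6, 1, 16, 15, tzinfo=timezone(timedelta(hours=-5)))
print(format_iso(dt, timespec="seconds"))                # 2024-06-01T16:15:00-05:00
print(format_iso(dt, Format.basic, timespec="seconds"))  # 20240601T161500-0500
```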
Feature or enhancement
Proposal:
In addition to the popular ISO 8601 Extended format, there's also an ISO 8601 Basic format for datetimes, which is useful for filenames and URL components as it avoids characters such as the colon and is more compact.
datetime.fromisoformat already supports parsing this format.
Example code:
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Linked PRs