Open sffc opened 11 months ago
Want to discuss with:
It should be noted that the ::format
function has an error only in DateTimeFormatter, not in TypedDateTimeFormatter, and the only error it returns is a mismatched calendar. So, any additional pre-processing we do would result in changing the signature of the ::format
functions on TypedDateTimeFormatter, which we could do in 2.0. However, it might be nice to avoid this.
I'd like to proceed with the following solution:
DateTimeNames::with_pattern
to two functions:
DateTimeNames::try_with_pattern
, which checks that the names are loaded correctly for the patternDateTimeNames::with_pattern_unchecked
, which has the current behavior (the function is unchecked but not unsafe)load_for_pattern
and include_for_pattern
the way they are.debug_assert!(false)
with the errorcore::fmt::Error
, which only happens without debug assertions, and which is silent only if logging is disabledDateTimeFormatter::format
is a mismatched calendar error. We could even change the error type to be specific for that case, but I don't se a good reason to do that.OK?
sgtm
Using core::fmt::Error
means code like formatted_date_time.to_string()
panics, so I don't think we should ever use it. We should GIGO, in debug mode we debug-assert, but in release mode we return Ok
and write something like <missing xy-data>
.
(edit: this is actually a requirement of the trait)
Document that the only error in DateTimeFormatter::format is a mismatched calendar error. We could even change the error type to be specific for that case, but I don't se a good reason to do that.
This is why I don't like the error enums. We have so many methods that can only error with a subset of variants, but the client always has to check all of them. Documentation is not a strong enough guarantee imho.
Using
core::fmt::Error
means code likeformatted_date_time.to_string()
panics, so I don't think we should ever use it. We should GIGO, in debug mode we debug-assert, but in release mode we returnOk
and write something like<missing xy-data>
.
Ok how about writing the string "(icu4x error)"? I don't want to carry around the Display impl for the Error type in release mode, and you get the whole error with debug assertions enabled.
Document that the only error in DateTimeFormatter::format is a mismatched calendar error. We could even change the error type to be specific for that case, but I don't se a good reason to do that.
This is why I don't like the error enums. We have so many methods that can only error with a subset of variants, but the client always has to check all of them. Documentation is not a strong enough guarantee imho.
I mostly agree. Not proposing to change anything right now.
We should use a different garbage string for each code location. (icu4x error)
is not actionable.
The formatters share the same code path.
The best we could do without an enum would be something like (error in icu::datetime::TypedDateTimeFormatter)
The GIGO doesn't have to happen at the top level that you linked. write_pattern
should be infallible, and then write_field
would do garbage out instead of returning errors: https://github.com/unicode-org/icu4x/blob/main/components/datetime/src/format/datetime.rs#L239. This lets us use different messages for different fields.
Hmm, replumbing the formating code is a fairly big change
My main concern with the Display impl is code size. We could avoid this by using a different error struct internally that carries a static str with the additional context
Or are you proposing that we produce outputs such as "(icu4x error: weekday names not loaded), 6 (icu4x error: month names not loaded) 2024" (instead of "Saturday, 6 January 2024")
Yes, that
IMO "{?}, 6 {?} 2024"
is more user friendly than "(icu4x error: weekday names not loaded), 6 (icu4x error: month names not loaded) 2024"
What are your criteria for "user friendly"? {?}
certainly looks nicer, but this is an error, it shouldn't look nice. We also want to communicate which names are not loaded; a bunch of {?}
are going to be harder to fix, making it less user friendly.
I think the user persona in your mind is a developer, and in my mind, end user of a product that utilizes ICU4X. Both personas are important and they should receive output adapted to their needs.
but this is an error, it shouldn't look nice.
The developer seeing an erroneous output should be able to act on it, I agree. Your sample serves that well.
The end user of a web browser cannot act on the error, and should see broken output in context of their UI. I believe my sample serves that better. The end user may also get a text error in their debug console, or some telemetry may send error data to developer for debugging, but the UI impact should be low.
The use case for this message is DateTimeNames::with_pattern_unchecked
. The unchecked
part clearly communicates to the developer that they have to be careful with this and verify the output before any user ever sees it. This is why I'm primarily thinking about communicating to the developer instead of an end-user.
I realized that this is basically the same problem space as "GMT+?" (#2245)
Along those lines, we could make up pithy error strings for each field that are both user friendly and searchable. For the month field:
Wdyt?
Discussed in meeting to a partial conclusion
@robertbastian is it guaranteed that we have a number when we don't have patterns/names/whatever?
@manishearth only cases where DTF might not have a value from the date is cyclic/related iso. totally fine to ? in those cases, like time zones
@robertbastian: original idea was
@robertbastian: shorter is "¿M01? 6, 2024". less clear it's broken
@zbraniecki: Do not want prod users seeing ICU4X error. it's meaningless. Ok with ¿? or brackets or whatever.
Ideas:
"(M01) 6, 2024"
"{M01} 6, 2024"
"[M01] 6, 2024"
"
Agreed (amongst @zbraniecki, @robertbastian, @manishearth):
- No scary strings in release/prod mode.
- Fine to have scary string in debug mode
- Shane can pick whatever brackets he wants
@sffc is this okay with you (and if so, please remove the discuss label)
I'd prefer to not include the month number because the error-state code path should be as minimal as possible; formatting "M01" is more complicated than "M" (even if only a little bit more complicated) and I want to start from the position of being as stripped-down as possible.
I am happy with 💥M
or 〚M〛
or ⦃M⦄
or «M»
or {M}
@zbraniecki How strongly would you like the month number to be rendered?
Discussion with @zbraniecki:
«M01»
, «gregory»
«E3»
«am»
«pm»
For numeric fields if the FixedDecimalFormatter is not loaded, load the FixedDecimalFormatter in no-data mode, but still indicate the error using brackets
Why not use Writeable for u8
?
Also I just realised
- Fine to have scary string in debug mode
We're debug_assert!
ing, so debug mode will panic anyway.
For numeric fields if the FixedDecimalFormatter is not loaded, load the FixedDecimalFormatter in no-data mode, but still indicate the error using brackets
Why not use
Writeable for u8
?
Option<FixedDecimalFormatter>
to FixedDecimalFormatter
that defaults to one that performs the desired error-like outputimpl Writeable for u8
has some code size, and it is not an impl that is necessarily going to be present in the binary (https://github.com/unicode-org/icu4x/blob/main/utils/writeable/src/impls.rs)The remaining work is semver breaking, so moving this to 2.0.
Neo uses the new error types. @robertbastian to investigate if this is done.
Currently there are 3 places where errors can be returned in datetime formatting:
DateTimeFormatter::try_new
DateTimeFormatter::format
FormattedDateTime::write_to
Note that case 3 is a
core::fmt::Error
whereas cases 1 and 2 areDateTimeError
. Any errors that occur duringwrite_to
are logged withlog::warn!
and then flattened to acore::fmt::Error
.Currently case 3 does not happen in normal public API usage of the icu::datetime crate. However, it is now possible to reach this case in DateTimePatternInterpolator (#3347).
If we moved this error to phase 2, we may end up slowing down the path for all users since we need to walk the whole pattern in order to figure out if the data is missing.
We should decide on this before shipping DateTimePatternInterpolator.