Open dara2 opened 2 years ago
--check
is catching both 7/1/51
and 8/1/70
for me, although it errors out on the first one. Once you correct that data, and rerun --check
, it errors out on the second.
The check for EDTF validity only happens during --check
. Would you prefer that Workbench checks also when it is about to populate the field, and if the date value is not valid, skips populating the Drupal field and logs that it skipped that value? If so, we can do that.
Ah, so perhaps the --check errored out on that first date, and the person running it didn't rerun the check after fixing the first issue? That makes sense. I think then it's just human error, since --check is flagging them. I could go either way on whether Workbench shouldn't ingest the date if it's not valid.
That sounds like what happened, but Workbench should minimize or even eliminate the impact of human error - that's why --check
exists in the first place.
Resolving #268 would make Workbench not error out after the first --check
failure. If that was in place, the user in the case above would have seen that there were more errors in the CSV and probably would not have proceeded to ingest. However, implementing that feature for the entirety of all possible errors found by --check
is a very substantial amount of labour. I think it will be much quicker, and also much safer in terms of not allowing invalid data to get added to Drupal, option would be to do what I describe above: validate CSV values not only during --check
but also during ingest/update and skip populating and log values that don't validate. All of the validation code used during --check
can be reused. Let me look at this option over the weekend to see what's involved.
I'll add, I don't mean that validating CSV data outside of --check
implies that running --check
would become redundant or unnecessary. Even with runtime validation in place, it would still be better to find and fix bad CSV data before committing data to Drupal than to unknowingly let Workbench create 1000 nodes and not populate a field because of simple error in the CSV data for that field.
Workbench --check is not flagging non-EDTF dates and is ingesting the data into EDTF fields (which then display incorrectly on the front-end). An example from our CSV for "Date Created" is
7/1/51
and it goes into the Date Created field as7/1/51
and is displayed on the front-end as7 to 1
. This isn't happening in all cases, oddly. One row of the CSV was flagged as not having an EDTF-formatted date (it flagged8/1/70
as non-EDTF). We are running Drupal 9.3.0, and Workbench was updated within the past 2 weeks.