NCEAS / metacatui

MetacatUI: A client-side web interface for DataONE data repositories
https://nceas.github.io/metacatui
Apache License 2.0

Handling Metacat datasets submission failures #2167

Open helbashandy opened 1 year ago

helbashandy commented 1 year ago

Describe the Issue On the ESS-DIVE project, we found a couple of use cases where a dataset submission's validation passes on MetacatUI but fails on Metacat. One example is entering special symbol characters in the abstract. Another is a data upload that fails due to network issues.

Currently, if an uncaught issue occurs on submission, the user can lose their entries and then has to contact the team to recover the data. We were wondering if there is a way to implement general exception handling for those Metacat errors when they arise, so that the user could correct their entry based on the error.

mbjones commented 1 year ago

@helbashandy Thanks for the report. Could you provide some examples of problematic documents that pass on the client side but fail on insert to Metacat?

Metacat uses the EML project's validator, which is definitely more robust than what we do on the client side. We do not have a full EML validator written in JavaScript, so simpler field validation is used on the client side, IIRC. With your examples, we might be able to catch a few more of these, but I would say that character-encoding problems in browser editing fields are the worst and very hard to fix automatically (because there is often no correct solution).
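One class of failures that could be caught client-side without a full EML validator is characters outside the XML 1.0 character range. A minimal sketch of such a check (the helper name is hypothetical; MetacatUI does not currently ship this function):

```javascript
// Hypothetical helper: report characters in a text field that are not
// legal in XML 1.0, so the editor could warn before submitting to Metacat.
function findInvalidXmlChars(text) {
  // XML 1.0 allows #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF.
  const invalid = /[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/gu;
  return [...text.matchAll(invalid)].map((m) => ({
    index: m.index,
    codePoint: m[0].codePointAt(0),
  }));
}

// e.g. findInvalidXmlChars("abstract\u0000text") reports one hit at index 8
```

This would only catch characters that are structurally illegal in XML; mis-encoded but legal characters (mojibake from cut-and-paste) would still pass, which is the "no correct solution" case above.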

vchendrix commented 8 months ago

@mbjones @rushirajnenuji We had another instance this week where a user received an error when they were saving a dataset and the resource map was not created.

The confluence of issues:

1) data.ess-dive.lbl.gov was inundated (and still currently is) with indexing tasks for replication verifications. There are lots of dirtySystemMetadata requests in the Apache logs.
2) The user received an error in the UI when saving. Their metadata changes were updated, but the resource map failed to create. They emailed us with an urgent request to fix it because they had told 150+ people the data was ready.
3) We repaired the resource map from the backend by associating the data files listed in the metadata file, which also matched the previous version of the dataset.
4) We found a draft dataset record for the metadata file that received the error, which told us the DOI for the updated dataset could not be found:

EML draft for ess-dive-e976198fe417dbb-20231222T191240371 (WHONDRS River Corridor Dissolved Oxygen, Temperature, Sediment Aerobic Respiration, Grain Size, and Water Chemistry from Machine-Learning-Informed Sites across the Contiguous United States (v3)) by Brieanne Forbes. This EML had the following save error: MNodeService.update - The object ess-dive-e976198fe417dbb-20240109T184353161 has been saved successfully on Metacat. However, the new metadata can't be registered on the DOI service: OSTIElinkService.getMetadata - the reponse is blank. So we can't find the identifier 10.15485/1923689, which type is doi ...

5) We manually looked up the DOI in OSTI to confirm that it was there. It was.
6) I can only assume that Metacat was overloaded with system metadata updates and this query failed to return a result due to the load in some way.

This particular error is of no concern to the user and should not have affected the saving experience (since the metadata was updated anyway). We already receive system emails for these errors.

Additionally, in other cases where there is an issue with the EML, the dataset metadata is NOT saved and we also end up with a missing resource map.

vchendrix commented 8 months ago

Additional dataset drafts that caused missing resource map

- ess-dive-47c069f4c82c6b3-20231115T191156677_draft.txt - this turned out to be an issue with the lat/lon coordinates, IIRC.
- ess-dive-09cbcb9008b436d-20230727T234116374_draft.txt - previous identifier was already obsoleted. We have many drafts with this particular error.
- ess-dive-b81dcf32b33472c-20230525T155438019_draft.txt - this dataset had the DOI issue. However, it is expected to not be found in OSTI because we do not own this particular DOI. We do own others with this prefix.
- ess-dive-e8676db66730d28-20230420T220927232_draft.txt - this one had some bad cut-and-paste XML in the abstract: <https: www.caee.utexas.edu="" setx-uifl="">. As part of this project, we will be developing climate-resilient design solutions for areas of the region.</https:>

vchendrix commented 8 months ago

Also note that any upload errors that are encountered will cause a missing resource map as well.

mbjones commented 8 months ago

@rushirajnenuji @robyn This seems related to issues #1318 and #1586 as well.

In both cases, it seems like a failure to save or validate a document leads to a corrupted/missing resource map.

The character-encoding problems are still tough to deal with, as discussed above, but we should be able to do better at detecting the validation error on the save to Metacat and not lose content when that happens. Our error handling pipeline in MetacatUI seems to miss that Metacat produces an error and silently moves on. This has been a common thread and involves data loss, so I am going to label this as critical.
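One way to stop the pipeline from silently moving on would be to inspect each save response for a DataONE-style <error> document before treating the save as successful. A rough sketch under that assumption (the function name is illustrative, not MetacatUI's actual API; the <error> element with name/errorCode attributes and a description child follows the DataONE service error convention):

```javascript
// Illustrative sketch: scan a Metacat/MN response body for a DataONE-style
// <error> document. Returns null when no error document is present.
function parseMetacatError(xmlText) {
  const m = /<error\b([^>]*)>([\s\S]*?)<\/error>/.exec(xmlText);
  if (!m) return null; // no error document in the response
  // Pull a named attribute out of the <error ...> tag, if present.
  const attr = (name) => {
    const a = new RegExp(name + '="([^"]*)"').exec(m[1]);
    return a ? a[1] : null;
  };
  const desc = /<description>([\s\S]*?)<\/description>/.exec(m[2]);
  return {
    name: attr("name"),
    errorCode: attr("errorCode"),
    description: desc ? desc[1].trim() : m[2].trim(),
  };
}
```

If this returns non-null, the editor could surface the description to the user and keep the draft, instead of proceeding as if the save succeeded.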

vchendrix commented 8 months ago

Also note that any upload errors that are encountered will cause a missing resource map as well.

#2134 is a similar issue but equally important in that it can result in a missing resourceMap.

mburrus commented 7 months ago

We encountered a data corruption use case on February 6th, 2024, around 9:35am PST. This is another example related to #2134 where a failed file upload prevented the dataset metadata from submitting; in this case the file upload timed out and crashed rather than failing with an error message. The user was creating a new dataset, and since the upload crashed before the submit button was hit, we don't even have a draft EML file. The user had to start over from scratch without support.

More details on the steps the user took here:

I created the dataset today and attempted to upload files, but they never popped up with a [green checkmark or any] status (i.e. uploading, uploaded, etc.). Once I realized that they were not loading, I tried to submit my data to save the other entered information. Then the portal stalled and the whole submission crashed. Then, I started over again. I did fill out some of the metadata before uploading the files (primarily the title and abstract), but I only hit the submit button after I realized the files were not uploading. The only warning or message was at the bottom above the submit data button letting me know that the submission is in progress.

mburrus commented 2 months ago

Another use case from 06/24/2024. Resource map broke and dataset could not be indexed because special characters were used in dataset metadata.

This user requested to publish a dataset and the ESS-DIVE team requested they make changes to meet our quality checks. When the user revised their dataset, they put some kind of special character in "Step 7" of the methods. The user did not report this to us. We noticed this issue on Jun 28th because our Jira automation (which creates new publication request notifications and updates existing publication requests with new PID versions) was broken; we were not being notified of new publication requests.

Solution: Create a new dataset version. We converted the special characters to the UTF-8 character set, but we can't know what should have been written there. The user will have to review and correct the characters. This solution also caused some duplication of requests on Jira that we had to correct. @vchendrix can provide more details on this use case.
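The repair step described above could be sketched as a substitution that swaps anything illegal in XML 1.0 for the Unicode replacement character, leaving a visible marker for the user to review. This is a sketch of the idea, not the actual fix that was applied:

```javascript
// Replace code points that are not legal in XML 1.0 with U+FFFD so a
// reviewer can find each spot; we cannot reconstruct the intended text.
function replaceInvalidXmlChars(text, replacement = "\uFFFD") {
  return text.replace(
    /[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]/gu,
    replacement
  );
}
```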

Broken dataset version: https://data.ess-dive.lbl.gov/view/ess-dive-3619bd077a60b7c-20240624T120319367
Fixed dataset version: https://data.ess-dive.lbl.gov/view/ess-dive-cd700a6203f8bad-20240708T143132157482

vchendrix commented 2 months ago

Another use case from 06/24/2024. Resource map broke and dataset could not be indexed because special characters were used in dataset metadata.

Thanks @mburrus for capturing more details here. I logged this particular issue in #2481 as well

vchendrix commented 3 weeks ago

Logging another issue we encountered on wfsi-data.org, which may have been due to the user saving in quick succession.

Error scenario

Message from user on 9/10/2024

How do I edit the dataset? I created it, uploaded 4 point cloud files, saved and stepped away, and now am unable to get back into it

Error message on save

From the look of their history (see User's edit history below), they were not editing metadata but just trying to upload files, saving in between uploads.

EML draft for wfsi-20240910T004051181-dea693a41e54b3f(Fuels data for 2019 Closing Gaps Sycan Nature Preserve research burn Unit 1_C_forest) by Sarah Flanary. This EML had the following save error: The previous identifier has already been made obsolete by: wfsi-20240910T004121322-ff8fc344276c691

Metadata wfsi-20240910T004121322-ff8fc344276c691, mentioned in the error, was missing a resource map, which tells me that maybe they were trying to save again before the previous dataset version had been indexed. This may have caused the editor to keep the previous identifier in memory instead of the one mentioned in the error. They kept trying to upload files and save, which kept resulting in the same error.

User's edit history

<!-- save error 4 -->
        <doc>
            <str name="id">wfsi-20240910T004725372-cf9f57a473147b7</str>
            <str name="fileName">eml_draft_Flanary.txt</str>
            <str name="formatId">text/plain</str>
            <long name="size">7953</long>
            <date name="dateUploaded">2024-09-10T00:47:25.541Z</date>
        </doc>
<!-- Upload and save error 3 -->
        <doc>
            <str name="id">wfsi-20240910T004724697-dfdbf8be3846281</str>
            <str name="fileName">eml_draft_Flanary.txt</str>
            <str name="formatId">text/plain</str>
            <long name="size">7953</long>
            <date name="dateUploaded">2024-09-10T00:47:24.902Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T004701546-f0ac034951449dd</str>
            <str name="fileName">GCP_locations.csv</str>
            <str name="formatId">text/csv</str>
            <long name="size">428</long>
            <date name="dateUploaded">2024-09-10T00:47:03.072Z</date>
        </doc>
<!-- save error 2 -->
        <doc>
            <str name="id">wfsi-20240910T004203134-3d871fdcd433e9b</str>
            <str name="fileName">eml_draft_Flanary.txt</str>
            <str name="formatId">text/plain</str>
            <long name="size">7806</long>
            <date name="dateUploaded">2024-09-10T00:42:03.300Z</date>
        </doc>
<!-- Upload and save error 1 -->
        <doc>
            <str name="id">wfsi-20240910T004202614-38133400041596b</str>
            <str name="fileName">eml_draft_Flanary.txt</str>
            <str name="formatId">text/plain</str>
            <long name="size">7806</long>
            <date name="dateUploaded">2024-09-10T00:42:02.789Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T004131325-abdd4d5227c40eb</str>
            <str name="fileName">fuel_cover_and_loadings.csv</str>
            <str name="formatId">text/csv</str>
            <long name="size">9588</long>
            <date name="dateUploaded">2024-09-10T00:41:32.442Z</date>
        </doc>
<!-- Missing resource map -->
        <doc>
            <str name="id">wfsi-20240910T004121322-ff8fc344276c691</str>
            <str name="fileName">Fuels_data_for_2019_Closing_Gaps_Sycan_Nature.xml</str>
            <str name="formatId">https://eml.ecoinformatics.org/eml-2.2.0</str>
            <long name="size">7349</long>
            <date name="dateUploaded">2024-09-10T00:41:21.857Z</date>
        </doc>
<!-- Last successful save -->
        <doc>
            <str name="id">wfsi-20240910T004051165-e2680b6f2b52db2</str>
            <str name="fileName">wfsi_20240910T004051165_e2680b6f2b52db2.rdf.xml</str>
            <str name="formatId">http://www.openarchives.org/ore/terms</str>
            <long name="size">4849</long>
            <date name="dateUploaded">2024-09-10T00:40:52.887Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T004051181-dea693a41e54b3f</str>
            <str name="fileName">Fuels_data_for_2019_Closing_Gaps_Sycan_Nature.xml</str>
            <str name="formatId">https://eml.ecoinformatics.org/eml-2.2.0</str>
            <long name="size">7349</long>
            <date name="dateUploaded">2024-09-10T00:40:51.756Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T004039869-bf24e8d6cfd5be1</str>
            <str name="fileName">wfsi_20240910T004039869_bf24e8d6cfd5be1.rdf.xml</str>
            <str name="formatId">http://www.openarchives.org/ore/terms</str>
            <long name="size">4849</long>
            <date name="dateUploaded">2024-09-10T00:40:42.105Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T004039885-bd5beaee9ce8563</str>
            <str name="fileName">Fuels_data_for_2019_Closing_Gaps_Sycan_Nature.xml</str>
            <str name="formatId">https://eml.ecoinformatics.org/eml-2.2.0</str>
            <long name="size">7349</long>
            <date name="dateUploaded">2024-09-10T00:40:40.938Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T000146916-b1624fc4efee242</str>
            <str name="fileName">outer_E_postburn.e57</str>
            <str name="formatId">application/octet-stream</str>
            <long name="size">246889472</long>
            <date name="dateUploaded">2024-09-10T00:29:53.078Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T000147049-d2085315f3f422f</str>
            <str name="fileName">outer_W_postburn.e57</str>
            <str name="formatId">application/octet-stream</str>
            <long name="size">245432320</long>
            <date name="dateUploaded">2024-09-10T00:29:50.933Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T000147044-6ea30ade6b69ee5</str>
            <str name="fileName">outer_S_postburn.e57</str>
            <str name="formatId">application/octet-stream</str>
            <long name="size">244446208</long>
            <date name="dateUploaded">2024-09-10T00:29:47.040Z</date>
        </doc>
        <doc>
            <str name="id">wfsi-20240910T000146982-d04e1f0045e41b8</str>
            <str name="fileName">outer_N_postburn.e57</str>
            <str name="formatId">application/octet-stream</str>
            <long name="size">243701760</long>
            <date name="dateUploaded">2024-09-10T00:29:37.369Z</date>
        </doc>

Thoughts on potential measures to prevent this from happening

Is it possible to prevent a dataset from being saved again until the previous save has been fully indexed? Maybe you can determine whether the resource map is there, and if not, prevent uploading and saving?
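The guard suggested above could look like a polling loop that keeps the save button disabled until the resource map from the previous save shows up in the index. Everything here is hypothetical: `queryFn` stands in for whatever lookup (e.g. against the repository's Solr index) confirms the resource map has been indexed.

```javascript
// Hypothetical guard: poll until the resource map id is visible in the
// index, then allow the next save. queryFn(id) should resolve to true
// once the resource map is indexed; it is injectable for testing.
async function waitForResourceMap(resourceMapId, queryFn, { retries = 10, delayMs = 3000 } = {}) {
  for (let i = 0; i < retries; i += 1) {
    if (await queryFn(resourceMapId)) return true; // indexed: safe to save again
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false; // still not indexed: keep the save/upload controls disabled
}
```

A bounded retry count matters here: under heavy indexing load (as in the dirtySystemMetadata scenario above), indexing may lag for a long time, and the UI should eventually tell the user to wait rather than spin forever.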