Open lizkrznarich opened 1 month ago
The CSVs have 502612 assertion ids to remove; however, 28820 of those ids were already removed in the previous cleanup processes, so we ended up with 473792 records to remove from the assertions table.
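For anyone re-checking these counts, here is a minimal sketch of the reconciliation, assuming the ids from the CSVs and the ids currently in the assertions table can each be loaded into memory. The function name `plan_removal` and the toy ids are hypothetical and only illustrate the set logic; this is not the script that was actually run.

```python
def plan_removal(csv_ids, existing_ids):
    """Split the ids listed in the CSVs into two groups:
    ids still present in the table, and ids already removed earlier."""
    requested = set(csv_ids)      # assumes duplicates in the CSVs should be ignored
    present = set(existing_ids)
    to_remove = requested & present      # still in the table, need deleting
    already_gone = requested - present   # removed by a previous cleanup
    return to_remove, already_gone


# Toy data for illustration only, not the real assertion ids.
to_remove, already_gone = plan_removal([1, 2, 3, 4, 5], [2, 3, 5, 9])

# The counts reported in this issue are consistent with each other.
assert 502612 - 28820 == 473792
```

With real data, `len(to_remove)` and `len(already_gone)` would be compared against the 473792 and 28820 figures above before running the delete.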
done - snapshot created delete-invalid-assertions-5256114
Following on from the work done in https://github.com/datacite/corpus-data-file/issues/18 and https://github.com/datacite/corpus-data-file/pull/37 to validate accession numbers and generate reports with the validation results, remove the assertions whose accession numbers the product team has identified as invalid per the spec in https://docs.google.com/document/d/1SaIU_HUFIMQN-PrYjZ0JJXuYBydWcYOLaQs6dJ1uW9c/edit.
Files with lists of assertions to remove are located in https://drive.google.com/drive/u/3/folders/19T2ICvsDUYk4VA9dbDagsWp5VGIAo7Vd