posixeleni closed 9 years ago
@ekraffmiller I have put in the fixes for the issues found in the error log you submitted to me. This won't need a new schema.xml, but a db drop will be needed so that @kcondon can see the new values in build and test them.
I did encounter two types of errors that I could not immediately resolve with the tsv, and I need your and @scolapasta's advice:
Import Exception processing file worldhistorical/122057.xml, msg:Error parsing datasetVersion: Value 'English and Dutch' does not exist in type 'language'
Import Exception processing file stwalter/117416-1.xml, msg:Error parsing datasetVersion: Value 'http://www.europeansocialsurvey.org' does not exist in type 'PSRI8'
Import Exception processing file PSReplication/122026-1.xml, msg:Error parsing datasetVersion: Value 'www.aiddata.org' does not exist in type 'PSRI8'
Import Exception processing file PSReplication/92777-1.xml, msg:Error parsing datasetVersion: Value 'www.isaidno.de' does not exist in type 'PSRI8'
Import Exception processing file SOCRATESJOURNAL/117782.xml, msg:Error parsing datasetVersion: Value 'http://www.socratesjournal.com/index.php/socrates/article/view/5' does not exist in type 'PSRI8'
Import Exception processing file SOCRATESJOURNAL/117782-1.xml, msg:Error parsing datasetVersion: Value 'http://www.socratesjournal.com' does not exist in type 'PSRI8'
Import Exception processing file SOCRATESJOURNAL/117782-2.xml, msg:Error parsing datasetVersion: Value 'http://www.socratesjournal.com/index.php/socrates/article/view/5' does not exist in type 'PSRI8'
For the English and Dutch one, we added this to the pre-scrub:
update studyfieldvalue set strvalue='English' where metadata_id=273999 and studyfield_id=218 and strValue='English and Dutch';
insert into studyfieldvalue (strvalue, metadata_id, studyfield_id, displayorder) values ('Dutch', 273999,218,1);
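For anyone who needs to repeat this kind of split for other compound values, here is a minimal sketch of the same UPDATE + INSERT pair as a reusable helper. It runs against a throwaway in-memory SQLite table rather than the real database. The helper name `split_compound_value`, the reduced table schema, and the starting displayorder of 0 for the existing row are assumptions made for illustration; only the table and column names come from the pre-scrub statements above.

```python
import sqlite3


def split_compound_value(conn, metadata_id, studyfield_id, compound, parts):
    """Replace one compound value with one row per part.

    The first part overwrites the existing row; the remaining parts are
    inserted with displayorder 1, 2, ... (hypothetical helper).
    Returns the number of parts written, or 0 if no row matched.
    """
    cur = conn.cursor()
    cur.execute(
        "UPDATE studyfieldvalue SET strvalue = ? "
        "WHERE metadata_id = ? AND studyfield_id = ? AND strvalue = ?",
        (parts[0], metadata_id, studyfield_id, compound),
    )
    if cur.rowcount == 0:
        return 0  # nothing matched, so do not insert orphan rows
    for order, part in enumerate(parts[1:], start=1):
        cur.execute(
            "INSERT INTO studyfieldvalue "
            "(strvalue, metadata_id, studyfield_id, displayorder) "
            "VALUES (?, ?, ?, ?)",
            (part, metadata_id, studyfield_id, order),
        )
    conn.commit()
    return len(parts)


# Assumed, reduced schema: only the columns named in the pre-scrub.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE studyfieldvalue (strvalue TEXT, metadata_id INTEGER, "
    "studyfield_id INTEGER, displayorder INTEGER)"
)
conn.execute(
    "INSERT INTO studyfieldvalue VALUES ('English and Dutch', 273999, 218, 0)"
)

written = split_compound_value(
    conn, 273999, 218, "English and Dutch", ["English", "Dutch"]
)
rows = conn.execute(
    "SELECT strvalue, displayorder FROM studyfieldvalue ORDER BY displayorder"
).fetchall()
print(rows)
```

Skipping the insert when the update matches nothing keeps the script safe to re-run after the value has already been split.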
@posixeleni
For the 2nd question: not sure.
They clearly can't be part of the controlled vocab. Is there some other existing field that these can be parsed to? If so, we can do a prescrub where we set that value and then delete these?
@scolapasta, to better answer that question, is there any way you or @ekraffmiller can give me the dataset IDs for the problematic datasets in Q#2? Once I go into the actual datasets and see what they are doing, it might be easier to resolve this issue.
The db id is in the error: it is the number before the dash in the xml file name. You can use that id and the version number (the number after the dash) on the study page in production, like so:
http://thedata.harvard.edu/dvn/faces/study/StudyPage.xhtml?studyId=92777&versionNumber=1
This one, for example, is a draft (I assume they all might be).
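To look up several failing files at once, the rule above (db id before the dash, version number after it) can be sketched as a small helper. The function name `study_page_url` is hypothetical, and leaving out `versionNumber` for file names that have no dash is an assumption, since the thread does not say which version those files refer to.

```python
import re
from urllib.parse import urlencode

STUDY_PAGE = "http://thedata.harvard.edu/dvn/faces/study/StudyPage.xhtml"

# "<directory>/<db id>[-<version number>].xml", as seen in the error log
_FILE_RE = re.compile(r"^[^/]+/(?P<study_id>\d+)(?:-(?P<version>\d+))?\.xml$")


def study_page_url(xml_path):
    """Build the production study page URL for a file named in the error log."""
    m = _FILE_RE.match(xml_path)
    if m is None:
        raise ValueError("unexpected file name: %r" % xml_path)
    params = {"studyId": m.group("study_id")}
    if m.group("version") is not None:
        # Assumption: no dash means no explicit version, so the parameter is omitted.
        params["versionNumber"] = m.group("version")
    return STUDY_PAGE + "?" + urlencode(params)


print(study_page_url("PSReplication/92777-1.xml"))
```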
@scolapasta @ekraffmiller after looking at the datasets, it became evident that they renamed/renumbered this field in 3.6, so all of the PSRI8 values with an error above should be mapped to PSRI3, which already exists in the custom metadata block and is a free-text field. Is this possible?
@ekraffmiller @scolapasta Sorry about the confusion this week over this one issue with PSRI8. I have gone in and made sure all the fields are mapped to the correct ones (especially the YES/NO/NA). Here is the updated spreadsheet with the correct mapping: https://docs.google.com/spreadsheets/d/1rZo3QugzmYifpo518QgvvpIOb14Z8IM-5dWumzG5FME/edit?usp=sharing
Please let me know if you need me to help with anything else.
@ekraffmiller I updated the csv file and checked it into github so let me know if I can help with anything else. Fingers crossed the error log looks MUCH better this time.
Ready for QA to test with a re-migration to see if any errors come up.
Had to add one more gsdCoordinatorName to the customGSD.tsv block:
Import Exception processing file GSD_Test_2/91588-1.xml, msg:Error parsing datasetVersion: Value 'Etzler, Danielle' does not exist in type 'gsdCoordinator'
Added more missing controlled vocabulary values to customGSD.tsv:
'Abalos, Inaki' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93054-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/92203.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93053-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/91635-1.xml, msg:Error parsing datasetVersion: Value 'Long, Judith' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93049-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93056-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/92212.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/92212-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/92203-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93289.xml, msg:Error parsing datasetVersion: Value 'Maltzan, Michael' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93050-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93271.xml, msg:Error parsing datasetVersion: Value 'Maltzan, Michael' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93052-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/91640.xml, msg:Error parsing datasetVersion: Value 'Bandy, Vincent' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93147.xml, msg:Error parsing datasetVersion: Value 'VanDerSys, Keith' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93148.xml, msg:Error parsing datasetVersion: Value 'VanDerSys, Keith' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93276.xml, msg:Error parsing datasetVersion: Value 'Hansch, Inessa' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93260.xml, msg:Error parsing datasetVersion: Value 'Hansch, Inessa' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93277.xml, msg:Error parsing datasetVersion: Value 'Hansch, Inessa' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93256.xml, msg:Error parsing datasetVersion: Value 'Maltzan, Michael' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/91622-1.xml, msg:Error parsing datasetVersion: Value 'Curtis, Lawrence' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/93041-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93051-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93055-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93149-1.xml, msg:Error parsing datasetVersion: Value 'VanDerSys, Keith' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/91645-1.xml, msg:Error parsing datasetVersion: Value '01402: Parallel Motion: Walden Pond, Concord / Central Park, New York' does not exist in type 'gsdCourseName'
Import Exception processing file gsd/93255.xml, msg:Error parsing datasetVersion: Value 'Hansch, Inessa' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/91643-1.xml, msg:Error parsing datasetVersion: Value '01403: After La Villette' does not exist in type 'gsdCourseName'
Import Exception processing file gsd/91699-1.xml, msg:Error parsing datasetVersion: Value '01404: California Limnolarium (experiments in projective processes)' does not exist in type 'gsdCourseName'
Import Exception processing file gsd/93225.xml, msg:Error parsing datasetVersion: Value 'Maltzan, Michael' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/91732-1.xml, msg:Error parsing datasetVersion: Value 'O'Donnell, Sheila' does not exist in type 'gsdFacultyName'
Import Exception processing file gsd/91694-1.xml, msg:Error parsing datasetVersion: Value 'Wu, Cameron' does not exist in type 'gsdCoordinator'
Import Exception processing file gsd/93146.xml, msg:Error parsing datasetVersion: Value 'VanDerSys, Keith' does not exist in type 'gsdFacultyName'
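Since many of the lines above repeat the same value, it can help to collapse the log into the distinct missing values per type before editing the tsv. The following is a rough sketch, assuming every relevant line follows the "Import Exception processing file ..." format shown above; the function name `missing_values_by_type` is made up for illustration.

```python
import re
from collections import defaultdict

# The value group is greedy so that names containing an apostrophe
# (e.g. O'Donnell) are captured whole.
LINE_RE = re.compile(
    r"Import Exception processing file (?P<file>\S+), "
    r"msg:Error parsing datasetVersion: "
    r"Value '(?P<value>.*)' does not exist in type '(?P<type>\w+)'"
)


def missing_values_by_type(log_lines):
    """Map each field type to the sorted, de-duplicated values it is missing."""
    missing = defaultdict(set)
    for line in log_lines:
        m = LINE_RE.search(line)
        if m:
            missing[m.group("type")].add(m.group("value"))
    return {field_type: sorted(values) for field_type, values in missing.items()}


sample = [
    "Import Exception processing file gsd/93054-1.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'",
    "Import Exception processing file gsd/92203.xml, msg:Error parsing datasetVersion: Value 'Other' does not exist in type 'gsdCoordinator'",
    "Import Exception processing file gsd/91732-1.xml, msg:Error parsing datasetVersion: Value 'O'Donnell, Sheila' does not exist in type 'gsdFacultyName'",
    "Import Exception processing file gsd/91694-1.xml, msg:Error parsing datasetVersion: Value 'Wu, Cameron' does not exist in type 'gsdCoordinator'",
]

summary = missing_values_by_type(sample)
for field_type, values in sorted(summary.items()):
    print(field_type, values)
```

Lines that do not match the pattern are skipped silently, so the summary should be checked against the raw log if the counts look too low.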
Added another missing value to the customGSD.tsv file based on @ekraffmiller's error log.
Import Exception processing file gsd/91732-1.xml, msg:Error parsing datasetVersion: Value 'Tuomey, John' does not exist in type 'gsdFacultyName'
@kcondon, testing this in build would need a db drop as well.
No further errors have been reported. I am closing this ticket.
A recent migration test by @ekraffmiller uncovered errors that I will need to fix in the following custom metadata blocks:
GSD Block
PSRI Block
ARCS Block