ykkunkels / ESIR_portal

Shiny app for the ESM Item Repository

Missing columns in complete Repository dataset for items 450 - 492 #20

Closed by OliviaKirtley 2 months ago

OliviaKirtley commented 5 months ago

When downloading the full set of items from the Repository, items 450 - 492 (Schneider - University of Geneva) have missing data in the following columns:

The issue originally occurred because the relevant columns were hidden in the completed item submission template for these items.

In Summer 2022, I updated the item submission template for these items (on OSF). I just checked again and in the file on the OSF, these columns are visible and data are present for the following columns:

Note: These items were submitted with the first version of the item submission template, so column order differs from current order in Repository and current template.

BenjaminKunc commented 3 months ago

Do we have an older version of this dataset? I checked the file in the Repository (uploaded to the Repository folder) and these columns are missing: Remarks, Which datasets does this item appear in, Contact person for each dataset + Email.

If the columns were there in some previous version, then it would be best to add them back into the file. If the items were submitted like this, then I suppose it would be best to ask the researchers whether they would be willing to provide the missing information. Another option would be to create empty columns with the same names (this is necessary for running Laura's script, which formats the dataset into the older template structure) and keep them empty for now.
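The empty-column option mentioned above could look roughly like the following. This is a minimal sketch, not the script referred to in this thread: the column names are copied from the comment above and may not match the real headers exactly, and `add_empty_columns`, `item_id`, and the two sample rows are hypothetical placeholders.

```python
import csv
import io

# Column names quoted from this thread; the exact headers in the real file may differ.
REQUIRED_COLUMNS = [
    "Remarks",
    "Which datasets does this item appear in",
    "Contact person for each dataset + Email",
]


def add_empty_columns(rows, fieldnames, required=REQUIRED_COLUMNS):
    """Append every missing required column and leave its cells empty."""
    missing = [name for name in required if name not in fieldnames]
    new_fieldnames = list(fieldnames) + missing
    # Existing values win; only the missing columns get an empty string.
    new_rows = [{**{name: "" for name in missing}, **row} for row in rows]
    return new_fieldnames, new_rows


# Hypothetical two-row sample standing in for the submission file.
sample = io.StringIO("item_id,item\n450,example item A\n451,example item B\n")
reader = csv.DictReader(sample)
rows = list(reader)
fieldnames, padded = add_empty_columns(rows, reader.fieldnames)

# Write the padded rows back out as CSV.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(padded)
print(out.getvalue())
```

Columns that already exist are left untouched, so the same function can be run on files that are only partly incomplete.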

OliviaKirtley commented 3 months ago

Hi Ben, have you checked the relevant submission file on the OSF to make sure these columns are visible there? (I checked before, but it is probably worth someone else checking.) I don't think those columns have ever been visible in the main Repository dataset, so my sense is that fixing this will involve adding the missing columns from the individual submission .csv into the main Repository dataset again.
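Copying the missing columns from an individual submission file into the main dataset could be sketched as below. This assumes both files share an item identifier column; `item_id`, `backfill`, and the sample values are hypothetical and only illustrate the idea.

```python
def backfill(main_rows, submission_rows, key, columns):
    """Copy `columns` from submission rows into main rows with the same key,
    but only where the main dataset's cell is empty."""
    by_key = {row[key]: row for row in submission_rows}
    filled = []
    for row in main_rows:
        source = by_key.get(row[key], {})
        updated = dict(row)
        for col in columns:
            if not updated.get(col) and source.get(col):
                updated[col] = source[col]
        filled.append(updated)
    return filled


# Hypothetical rows; real data would be read with csv.DictReader.
main = [
    {"item_id": "100", "Remarks": "kept"},
    {"item_id": "450", "Remarks": ""},
]
submission = [{"item_id": "450", "Remarks": "example remark"}]

result = backfill(main, submission, key="item_id", columns=["Remarks"])
print(result)
```

Only empty cells are filled, so values already present in the main dataset are never overwritten by the submission file.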

BenjaminKunc commented 3 months ago

I might have overlooked something, but it seems to me that the original submission file does not include these columns. That is, not only is the information missing, but the columns themselves seem to have been deleted from this particular submission template at some point.

The dataset and contact columns are both present in the main Repository file and in the new submission template (which was used in this case), just under different names. However, the submission template with the items from Schneider has neither of these columns. I also tried unhiding columns in Excel and they still seem to be missing.

I checked both the current and older version of this dataset, yet I couldn't find these columns.

ykkunkels commented 3 months ago

Hi Ben and Olivia,

All current and previous versions of the data can be found on OSF (https://osf.io/5ba2c). You can click on the counterclockwise arrow on the right side of the screen to see all 23 versions.

I just scanned it quickly, and in the newer versions (from version 20 onward) there is no "Remarks" column, but in older versions (version 19 and before) there is. At that point the format of the data was revised, dropping columns and adjusting column names.
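A quick way to see which columns were dropped or added between two downloaded versions is to compare their header rows. The sketch below is only an illustration: `header_diff` is a hypothetical helper, and the two header lists are made-up stand-ins for an older and a newer version, not the real headers.

```python
def header_diff(old_header, new_header):
    """Return (dropped, added) column names, preserving their original order."""
    dropped = [name for name in old_header if name not in new_header]
    added = [name for name in new_header if name not in old_header]
    return dropped, added


# Hypothetical headers; in practice, read the first row of each CSV version.
old_header = ["item_id", "item", "Remarks"]
new_header = ["item_id", "item"]

dropped, added = header_diff(old_header, new_header)
print("dropped:", dropped)
print("added:", added)
```

Renamed columns show up as one dropped name plus one added name, so the output still needs a manual check against the template.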

BenjaminKunc commented 3 months ago

Hi Yoram,

I checked the submitted items once again and found that I had indeed overlooked the columns containing the dataset name and contact information.

I added the missing information into the main Repository file and uploaded it as a new version to OSF.