NHMDenmark / Mass-Digitizer

Common repo for the DaSSCo team
Apache License 2.0

Sort out folder names for dataset exports #350

Closed: jlegind closed this issue 1 year ago

jlegind commented 1 year ago

Issue

We need to have a consistent way of naming data export files for easy identification.

Description

There needs to be a simple logical structure to the dataset exports from the Digi App so that post-processing and workbench imports can run smoothly.

Solution

Directory Path: N:\SCI-SNM-DigitalCollections\DaSSCo\Digi App\Data

Suggestion:
After a file arrives in path\Exported_files_from_app and is picked up by the data manager, it needs to be transferred into the PostProcessed_openrefine sub-directory to facilitate tracking of the process. Ideally the Exported_from_app_data_files directory stays almost empty, which tells us that the exported datasets are being post-processed and imported into Specify.

After the post-processed dataset is imported into Specify, it is moved into the 'Imported_specify' directory, which exists in the 'Path' directory.
When I write that files are 'moved', I mean that they are cut and pasted, so that files are not replicated all over the directory structure.
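
The cut-and-paste move described above could be scripted roughly as follows. This is only a minimal sketch: the function name `move_dataset` and the `data_root` argument are hypothetical, and the stage names are copied from the directory names mentioned in this issue.

```python
import shutil
from pathlib import Path

# Stage directory names as written in this issue; adjust if the real folders differ.
STAGES = ("Exported_files_from_app", "PostProcessed_openrefine", "Imported_specify")


def move_dataset(file_path: Path, data_root: Path, stage: str) -> Path:
    """Move (not copy) a dataset file into the given stage directory under data_root."""
    if stage not in STAGES:
        raise ValueError(f"Unknown stage: {stage}")
    target_dir = data_root / stage
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / file_path.name
    # Refuse to overwrite, so a dataset is never silently replaced.
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    # shutil.move removes the source, so only one copy of the file remains.
    return Path(shutil.move(str(file_path), str(target)))
```

Because the file is moved rather than copied, the source directory empties out as datasets progress through the stages.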

jlegind commented 1 year ago

Import protocol has also been updated. https://github.com/NHMDenmark/Mass-Digitizer/blob/main/documentation/import_protocol.md

PipBrewer commented 1 year ago

Finish updating folders with Matilde's corrected spreadsheets: checking that they are in Specify, moving them to the correct folder, and deleting unwanted folders.

jlegind commented 1 year ago

My immediate solution would be the [collection name], like NHMD_Herba, then the [date] (yyyymmdd), then [hours_minutes] and the initials, all separated by underscores. Example:
NHMD_Herba_20230621_14_05_MG.csv
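
For illustration, a name in this format could be built and checked along these lines. This is a sketch only: the function names and the regular expression are assumptions based on the single example above, not part of the Digi App.

```python
import re
from datetime import datetime

# Assumed pattern: <collection>_<yyyymmdd>_<hh>_<mm>_<initials>.csv
EXPORT_NAME = re.compile(
    r"(?P<collection>.+?)_(?P<date>\d{8})_(?P<hour>\d{2})_(?P<minute>\d{2})"
    r"_(?P<initials>[A-Za-z]+)\.csv"
)


def build_export_name(collection: str, timestamp: datetime, initials: str) -> str:
    """Return a file name such as NHMD_Herba_20230621_14_05_MG.csv."""
    return f"{collection}_{timestamp:%Y%m%d_%H_%M}_{initials}.csv"


def parse_export_name(name: str):
    """Return the name's parts as a dict, or None if it does not follow the format."""
    match = EXPORT_NAME.fullmatch(name)
    if match is None:
        return None
    parts = match.groupdict()
    try:
        # Reject impossible dates and times such as 20231345 or 25_61.
        datetime.strptime(parts["date"] + parts["hour"] + parts["minute"], "%Y%m%d%H%M")
    except ValueError:
        return None
    return parts
```

Calling `build_export_name("NHMD_Herba", datetime(2023, 6, 21, 14, 5), "MG")` reproduces the example file name above.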

jlegind commented 1 year ago

Matilde and Chelsea have been contacted.

jlegind commented 1 year ago

Matilde agreed to the proposed format above.

PipBrewer commented 1 year ago

@jlegind There still seem to be unwanted folders in the Data folder on the N Drive (for imports)

jlegind commented 1 year ago

I did a thorough clean up and double checking of the redundant folders and their files. Those folders are gone now.

jlegind commented 1 year ago

There is a 'Capture_one_test' directory which contains many .tif files. I decided to leave this one alone. There is also a temp folder which I use for checking file structure: saving suspect csv exports as Excel files gives a quick overview of content and columns. Is there data missing? Do all records have the correct number of columns?
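
The column-count question could also be checked with a small script instead of Excel. This is a rough sketch: the function name is hypothetical, and the comma delimiter and UTF-8 encoding are assumptions about the export files.

```python
import csv


def find_ragged_rows(csv_path, delimiter=",", encoding="utf-8"):
    """Return (line_number, column_count) for rows whose length differs from the header."""
    with open(csv_path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return []  # empty file, nothing to compare against
        expected = len(header)
        # reader.line_num is the number of the last line read from the file.
        return [(reader.line_num, len(row)) for row in reader if len(row) != expected]
```

An empty result means every record has as many columns as the header. Blank lines are reported as rows with zero columns.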