gdcc / easyDataverse

🪐 - Lightweight Dataverse interface in Python to upload, download and update datasets found in Dataverse installations.
MIT License

folder structure and description of added files not being preserved #13

Open cristobaltapia opened 1 year ago

cristobaltapia commented 1 year ago

Hi,

I was trying to upload a bunch of files with descriptions directly with easyDataverse but I am having a little issue. Although the files are uploaded, the information corresponding to the folder structure and file description is lost. Does this happen to anyone else, or is it just me?

Cheers

kbrueckmann commented 2 weeks ago

I'm not one of the developers, but as far as I know the default description is an empty string. So if you update a file that previously had a description and don't pass that description again, the empty default overwrites it and the description is lost. Regarding the folder structure: I had no problems here. I did notice that files inside folders don't get the categories and descriptions I specified if I add them as directories. As I didn't see any parameter to change that behaviour, I iterate over the data like this and use the add_file() method:

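    import os
    import re

    # Add each file individually so the categories and description are applied.
    for dir_path, dir_names, file_names in os.walk(path):
        for file_name in file_names:
            dataset.add_file(
                local_path=os.path.join(dir_path, file_name),
                categories=categories,
                # Strip the leading "<path>/" so dv_dir is relative to the root folder.
                dv_dir="" if dir_path == path else re.sub(f"^{re.escape(path)}/", "", dir_path, count=1),
                description=description,
            )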

The weird regex is just there to remove the folder in which I store all the data from the path.
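
The same relative-directory computation can probably be done without a regex by using os.path.relpath. The following is only a minimal sketch: the helper name relative_dv_dir is hypothetical and not part of easyDataverse, and it assumes that dv_dir expects a forward-slash path relative to the dataset root, with an empty string for files at the top level.

```python
import os


def relative_dv_dir(root: str, dir_path: str) -> str:
    """Return dir_path relative to root, or "" when they are the same folder.

    Hypothetical helper for illustration; not part of easyDataverse.
    """
    rel = os.path.relpath(dir_path, root)
    if rel == os.curdir:
        # Files directly inside the root folder get no sub-directory.
        return ""
    # Assumption: the target directory should use forward slashes.
    return rel.replace(os.sep, "/")
```

In the loop above, the result of relative_dv_dir(path, dir_path) would be passed as dv_dir. Unlike the regex, this does not depend on the root folder name being free of characters such as "." or "+".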