nomad-coe / nomad

NOMAD lets you manage and share your materials science data in a way that makes it truly useful to you, your group, and the community.
https://nomad-lab.eu
Apache License 2.0

Error when deleting entry #41

Open ericpre opened 1 year ago

ericpre commented 1 year ago

I have set up a nomad-oasis server with the ./volumes folder symlinked to a separate folder. Uploading works fine and I can see the data being created in the right place, but there are errors when deleting the entry from the "your existing upload" section:

at first attempt:

Process delete_upload failed: OSError: [Errno 39] Directory not empty: 'archive'

at second attempt:

Process delete_upload failed: OSError: [Errno 16] Device or resource busy: '.nfs000000010fa5315e00000001'


markus1978 commented 1 year ago

It looks like there are some files that were not created by NOMAD (i.e. by the nomad user, Linux uid 1000). When NOMAD tries to delete the directory, it is either not removing the extra files or not allowed to remove them. From the file name, I would assume that your NFS implementation is creating extra files in the upload folders, or that NFS is in the middle of an operation on the file, or something like this.

Could you check the owning user id and permissions of this file '.nfs000000010fa5315e00000001' for us? This might help to find a solution. You have to imagine that NOMAD acts as a normal user with id 1000 that tries to "rm -r" a directory.
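For illustration, the ownership check requested above can be done from Python as well as with ls -l. This is a minimal sketch, not part of NOMAD: the function name describe_for_delete is hypothetical. It also reports the parent directory, because on POSIX file systems removing a file requires write and search permission on the directory that contains it, not on the file itself.

```python
import os
import stat


def describe_for_delete(path):
    """Report owner and permissions of a file and of its parent directory.

    Hypothetical helper for illustration only; it is not a NOMAD API.
    """
    parent = os.path.dirname(os.path.abspath(path))
    report = {}
    for label, target in (("file", path), ("parent", parent)):
        st = os.lstat(target)  # lstat: describe a symlink itself, not its target
        report[label] = {
            "uid": st.st_uid,
            "gid": st.st_gid,
            "mode": stat.filemode(st.st_mode),  # e.g. '-rw-r--r--'
        }
    return report
```

Calling describe_for_delete on the .nfs file from inside the container shows whether uid 1000 owns it and whether the archive directory is writable for that user.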

ericpre commented 1 year ago

Thanks @markus1978 for the quick reply, please find below the information for this file:

-rw-r--r--. 1 localadmin localadmin 184907 Feb 14 14:52 .nfs000000010fa5315e00000001

The UID of localadmin is 1000.

After restarting the container, the .nfsxxxx... file disappeared and I was able to delete the entry. There are two things which are different from the "standard" configuration, i.e. following https://nomad-lab.eu/prod/v1/staging/docs/oasis.html#quick-start:

I have tried changing the path of the .volumes folder in the docker-compose.yaml file to point directly to the folder (instead of using the symlink); the same error occurs and a .nfsxxxx file is created.

markus1978 commented 1 year ago

I classify this as a bug for now. From what you are saying, NOMAD should have been able to delete the file itself and consequently should be able to delete the folder. And even if not, NOMAD should expect these situations, because we want to enable clients to integrate the NOMAD directories into existing storage solutions, as you are doing.
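One way such situations could be tolerated is a delete that retries when the file system reports ENOTEMPTY or EBUSY, the two errors seen in this issue. The following is only a sketch under that assumption, not NOMAD's actual delete_upload code; remove_tree and its retries and delay parameters are hypothetical names.

```python
import errno
import os
import shutil
import time


def remove_tree(path, retries=3, delay=0.5):
    """Remove a directory tree, retrying on ENOTEMPTY/EBUSY.

    Returns True if the path is gone afterwards, False otherwise.
    Hypothetical helper for illustration; it is not NOMAD code.
    """
    for _ in range(retries):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            return True  # already gone, nothing left to do
        except OSError as e:
            if e.errno not in (errno.ENOTEMPTY, errno.EBUSY):
                raise  # unrelated failure, e.g. a permission error
            time.sleep(delay)  # give the file system time to settle
    return not os.path.exists(path)
```

A retry only helps if whatever blocks the removal goes away in the meantime; if a long-running process keeps a file open, the caller still has to handle the False result.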

markus1978 commented 1 year ago

@mohammadnakhaee Can you have a look at this, please? You could experiment with externally created extra "secret" files (starting with .) in the .volumes upload folders. It is more likely that this happens with such files in the upload folder or the upload archive folder. Just see whether you can reproduce it.

ericpre commented 1 year ago

For completeness, the full path is:

/oasis-data/.volumes/fs/staging/ZX/ZXm1pqqmQ1eewxe6HCB7vw/archive/.nfs000000010638e2dc00000007

The file name is different because it is from a different upload/delete test. This file seems to be created when attempting to delete the data entry.

It deletes the upload successfully when following these steps:

Could it be that something keeps a file open (I can see only a *.msg file in the archive folder) when it shouldn't, and that this causes the error?
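This would be consistent with how NFS clients behave: when a file is removed while a process still has it open, the client renames it to a .nfsXXXX file and only removes it once the last handle is closed. To check for an open handle on Linux, one can scan /proc for file descriptors pointing into the upload directory. A minimal sketch, assuming a Linux /proc file system; the function name open_files_under is hypothetical:

```python
import os


def open_files_under(directory, proc_root="/proc"):
    """List (pid, path) pairs for open files located below `directory`.

    Linux only; processes whose fd table cannot be read are skipped.
    Hypothetical helper for illustration only.
    """
    prefix = os.path.realpath(directory) + os.sep
    found = []
    if not os.path.isdir(proc_root):
        return found  # no /proc here, nothing to inspect
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or access not permitted
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed in the meantime
            if target.startswith(prefix):
                found.append((int(pid), target))
    return found
```

This has to run in the same container as the NOMAD process, since only processes in the same PID namespace are visible in /proc.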

mohammadnakhaee commented 1 year ago

I could reproduce it by changing the attribute of an extra file: sudo chattr +i .test