kuleuven / mango-ingest

BSD 3-Clause "New" or "Revised" License

Does the tool wait for large files to be complete before ingesting? #5

Open toby-verwimp opened 1 day ago

toby-verwimp commented 1 day ago

We are running tests 24/7, and high-frequency data is acquired every 10 minutes; the largest files are up to 100 MB. The data is exported from the DAQ system's default format to MATLAB data files (.mat) in a dedicated directory. Exporting such large files can take a minute or more. As a result, the destination file is created first and then filled gradually (as we can see in Windows File Explorer).

That said, in the past we built our own tool to automatically upload this data to a collection in ManGO. However, we noticed that files were ingested even when they were not yet complete.

Therefore, my question is: does the tool developed here wait for large files to be complete before ingesting them to the collections in ManGO?

Thanks a lot!

paulborgermans commented 23 hours ago

It depends :-)

When a file is created or modified, it is first registered in a queue. Only when its "modified" timestamp is sufficiently old is the file considered "at rest" and uploaded to ManGO/iRODS.
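To illustrate the idea, here is a minimal sketch of such an "at rest" check in Python. It is not the tool's actual code: the function names `is_at_rest` and `pop_files_at_rest`, the use of a plain set as the queue, and the 60-second threshold in the usage note are all assumptions made for the example.

```python
import os
import time


def is_at_rest(path, min_age_seconds, now=None):
    """Return True if the file's "modified" timestamp is at least
    min_age_seconds old. A file that no longer exists is never at rest."""
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return False
    return now - mtime >= min_age_seconds


def pop_files_at_rest(queue, min_age_seconds, now=None):
    """Remove and return (sorted) the queued paths that are at rest.

    `queue` is a set of paths (hypothetical structure for this sketch);
    files that are still being written stay in the queue for a later pass.
    """
    ready = sorted(p for p in queue if is_at_rest(p, min_age_seconds, now))
    for p in ready:
        queue.discard(p)
    return ready
```

Calling `pop_files_at_rest(queue, 60)` periodically would hand over only files that have not been modified for at least a minute. The threshold has to be longer than the longest pause the exporting program makes between writes, otherwise a file that is still being filled could be picked up too early.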

On Linux-like filesystems, when using "native" for the --observer option, a "file closed" event is sent and captured when a file is closed after writing. In this case, the tool does not wait but starts the upload immediately.