Closed tgrunzweig-cpacket closed 1 year ago
The issue is, as you pointed out, that the default date_conversion_func used by the DFPFileBatcherStage (https://github.com/nv-morpheus/Morpheus/blob/branch-23.07/docs/source/developer_guide/guides/6_digital_fingerprinting_reference.md#file-batcher-stage-dfpfilebatcherstage) assumes that the creation time of the file is encoded in the filename as an ISO 8601 formatted date string. If the date cannot be obtained this way, the method falls back to retrieving the modification date from the filesystem (via https://filesystem-spec.readthedocs.io/en/latest/api.html?highlight=modified#fsspec.spec.AbstractFileSystem.modified).
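To make the two-step lookup concrete, here is a minimal sketch of the idea, not the actual Morpheus code. The names `ISO_DATE_RE`, `extract_date`, and `modified_lookup` are hypothetical, the regex is only one possible reading of "ISO 8601 in the filename", and the timestamp is assumed to be UTC.

```python
import re
from datetime import datetime, timezone
from typing import Callable

# Hypothetical pattern for an ISO 8601-style timestamp embedded in a file name,
# e.g. "duo_2023-07-15T12:30:45Z.json". The real default pattern may differ.
ISO_DATE_RE = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
    r"T(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})"
)


def extract_date(path: str, modified_lookup: Callable[[str], datetime]) -> datetime:
    """Return the timestamp encoded in `path`, or fall back to `modified_lookup`."""
    match = ISO_DATE_RE.search(path)
    if match:
        # Step 1: the date is encoded in the file name (assumed to be UTC here).
        parts = {name: int(value) for name, value in match.groupdict().items()}
        return datetime(**parts, tzinfo=timezone.utc)
    # Step 2: no date in the name, so ask the filesystem for the modification time.
    return modified_lookup(path)
```

With fsspec, `modified_lookup` would correspond to the filesystem's `modified(path)` method linked above. The base `AbstractFileSystem.modified` raises `NotImplementedError`, so the fallback only works for backends that implement it.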
My guess is that this wasn't supported in the versions we currently have in the container, but was added in more recent versions.
Is this the relevant environment setup file: https://github.com/nv-morpheus/Morpheus/blob/branch-23.07/docker/conda/environments/cuda11.8_examples.yml ? In there, s3fs is pinned to 22.08.2, and this is in the latest 23.07 branch. Perhaps it needs to be upgraded there?
@Tzahi-cpacket agreed; the only question is whether it introduces any incompatibilities with other code. I'll take a pass at it.
Version
23.07
Which installation method(s) does this occur on?
Docker
Describe the bug.
Run code similar to dfp_duo_training.ipynb, but with input file names that do not contain an explicit date and time, so that date_extractor executes the following line in file_utils.py:
getting error:
Minimum reproducible example