The AllenNLP team has created a Python package called cached-path, which can replace the whole of the file_utils.py module, as that module was mostly taken from the AllenNLP code base. The benefit of moving to this package is that it is likely to handle more edge cases and to receive fixes for future problems. It would also mean that we no longer have to maintain that module in this code base.
The downside is that the package requires several dependencies that we currently do not use or need, such as:
huggingface-hub
google-cloud-storage
boto3
filelock
However, we may need all of these requirements in the future, e.g. if we wish to train models with datasets from huggingface-hub.
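
As a rough illustration of what the switch could look like, here is a minimal sketch. It assumes the package is imported as `cached_path` and exposes a `cached_path()` function that takes a URL or local path and returns a local path. The wrapper name `resolve_resource` and the local-only fallback are hypothetical and are not taken from file_utils.py; they only show one way to keep the new dependency optional while the trade-off is being decided.

```python
from pathlib import Path

try:
    # Provided by the cached-path package (pip install cached-path).
    from cached_path import cached_path as _cached_path
except ImportError:
    # Package not installed: fall back to handling local files only.
    _cached_path = None


def resolve_resource(url_or_filename):
    """Hypothetical wrapper: return a local Path for a URL or file name."""
    if _cached_path is not None:
        # cached-path downloads and caches remote resources,
        # and passes existing local paths through.
        return Path(_cached_path(url_or_filename))

    # Fallback (assumption): no download support, local paths only.
    path = Path(url_or_filename)
    if path.exists():
        return path
    raise FileNotFoundError(f"file {url_or_filename} not found")
```

With a wrapper like this, call sites that currently import from file_utils.py would only need to change their import, and the extra dependencies would be pulled in only by users who install cached-path.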