This is similar to #559. I am running setfit in a container, and the process starts in a directory that is not writeable by the current user. This results in a PermissionError at runtime.
I can reproduce this locally with the example, even when output_dir is set in TrainingArguments, by chowning the execution directory to another user.
bad_dir % python3 example.py
Using the latest cached version of the dataset since sst2 couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'default' at /Users/bluestealth/.cache/huggingface/datasets/sst2/default/0.0.0/8d51e7e4887a4caaa95b3fbebbf53c0490b58bbb (last modified on Tue Oct 1 18:57:42 2024).
/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:1617: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be deprecated in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
warnings.warn(
model_head.pkl not found on HuggingFace Hub, initialising classification head with random weights. You should TRAIN this model on a downstream task to use it for predictions and inference.
Applying column mapping to the training dataset
Applying column mapping to the evaluation dataset
Traceback (most recent call last):
File "/Users/bluestealth/testing-setfit/bad_dir/example.py", line 27, in <module>
trainer = Trainer(
^^^^^^^^
File "/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/setfit/trainer.py", line 328, in __init__
self.st_trainer = BCSentenceTransformersTrainer(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/setfit/trainer.py", line 48, in __init__
super().__init__(model=setfit_model.model_body, **kwargs)
File "/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/sentence_transformers/trainer.py", line 201, in __init__
super().__init__(
File "/Users/bluestealth/testing-setfit/.env/lib/python3.12/site-packages/transformers/trainer.py", line 611, in __init__
os.makedirs(self.args.output_dir, exist_ok=True)
File "<frozen os>", line 225, in makedirs
PermissionError: [Errno 13] Permission denied: 'tmp_trainer'
This happens because super().__init__() is called before the passed-in arguments are applied.
Since no TrainingArguments are passed in at that point, output_dir defaults to "tmp_trainer" in the sentence transformers trainer. Then, when the sentence transformers trainer calls super().__init__(), the transformers trainer tries to create output_dir in the current working directory, causing the error above.
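To illustrate the ordering problem without needing an unwriteable directory, here is a minimal sketch using only the standard library. ToyBaseTrainer, ToyWrapperTrainer and created_entries are hypothetical stand-ins I made up for this illustration, not the real setfit, sentence-transformers or transformers classes. The only behaviour taken from the traceback is that the base constructor calls os.makedirs(output_dir, exist_ok=True) with the default "tmp_trainer".

```python
import os
import tempfile


class ToyBaseTrainer:
    """Hypothetical stand-in for a base trainer that creates output_dir in __init__."""

    def __init__(self, output_dir="tmp_trainer"):
        self.output_dir = output_dir
        # Relative path, so it is resolved against the current working directory.
        os.makedirs(self.output_dir, exist_ok=True)


class ToyWrapperTrainer(ToyBaseTrainer):
    """Hypothetical wrapper that applies the user's output_dir too late."""

    def __init__(self, output_dir=None):
        # The base constructor runs with its default, so "tmp_trainer"
        # is created in the working directory at this point.
        super().__init__()
        if output_dir is not None:
            # The user's value is only stored afterwards; nothing is created for it.
            self.output_dir = output_dir


def created_entries(user_output_dir_name="my_output"):
    """Build the wrapper inside a fresh temp dir and list what was created there."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as workdir:
        os.chdir(workdir)
        try:
            ToyWrapperTrainer(output_dir=os.path.join(workdir, user_output_dir_name))
            return sorted(os.listdir(workdir))
        finally:
            # Leave the temp dir before it is deleted.
            os.chdir(original_cwd)


if __name__ == "__main__":
    print(created_entries())  # ['tmp_trainer']
```

In this toy version the default directory is created even though a different output_dir was requested, which is why an unwriteable working directory would fail before the user's setting is ever consulted.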
v1.1.0