leondz / garak

LLM vulnerability scanner
https://discord.gg/uVch4puUCs
Apache License 2.0

atkgen forking HF tokenizer despite being non-parallel #750

Open leondz opened 1 week ago

leondz commented 1 week ago
probes.atkgen.Tox:   0%|          | 0/5 [00:00<?, ?it/s]
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...    | 1/10 [00:02<00:23,  2.60s/it]
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
probes.atkgen.Tox:  20%|███████████████████████████▌                                                                                       

This shouldn't happen: atkgen isn't parallelisable, so the attack model (gpt2 here) shouldn't get forked.
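
One way to find out what is actually forking might be to hook the fork itself and dump the Python stack at that moment. The snippet below is a rough diagnostic sketch, not garak code: `report_fork` and `fork_stacks` are names made up for illustration, and it relies only on the standard library's `os.register_at_fork` (POSIX, Python 3.7+). One caveat: these hooks fire for `os.fork()` and fork-based `multiprocessing`, but typically not for `subprocess` launches, so an empty result would not rule out a subprocess as the cause.

```python
import os
import sys
import traceback

# Hypothetical helper state: every stack captured just before a fork.
fork_stacks = []


def report_fork():
    """Record and print the Python stack of the code requesting a fork."""
    stack = "".join(traceback.format_stack())
    fork_stacks.append(stack)
    print(f"[pid {os.getpid()}] fork requested from:\n{stack}", file=sys.stderr)


# register_at_fork only exists on POSIX builds; guard so the snippet
# still imports cleanly elsewhere.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(before=report_fork)
```

This would need to run at the start of the process, before the probe begins, so the hook is in place when the fork happens. The last frames of the printed stack should then point at whichever call is spawning the child.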

linking #109
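
As a stopgap, the warning's own second suggestion can be applied from Python by setting `TOKENIZERS_PARALLELISM` before the attack model's tokenizer is loaded. This is a minimal sketch and only quiets the message; it does not explain or prevent the unexpected fork. The helper name `pin_tokenizers_parallelism` is hypothetical, and the choice of `"false"` as the default is an assumption.

```python
import os


def pin_tokenizers_parallelism(value="false"):
    """Set TOKENIZERS_PARALLELISM unless the user has already chosen a value.

    Returns the value that is in effect afterwards.
    """
    # setdefault leaves an existing user-provided setting untouched.
    return os.environ.setdefault("TOKENIZERS_PARALLELISM", value)


effective = pin_tokenizers_parallelism()
print(f"TOKENIZERS_PARALLELISM={effective}")
```

Using `setdefault` rather than a plain assignment keeps any value already exported in the shell, so the sketch does not override an explicit user setting.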