lllyasviel / stable-diffusion-webui-forge

GNU Affero General Public License v3.0

Any chance we can get the auto opt-out of huggingface telemetry? #1014

Open 311-code opened 2 months ago

311-code commented 2 months ago

I never know if I'm actually doing this right tbh... But as per this post, I would love to see some environment variables added to webui.bat that automatically protect us forge fans from telemetry being sent by huggingface-hub (in venv/lib/site-packages or c:\users\[yourusername]\.cache\huggingface), such as in transformers/utils/hub.py and hf_api.py, or from any other telemetry that sdforge becomes aware of and that could be automatically opted out of.

Telemetry was an issue in Comfy-cli recently, which was sending telemetry data by default. They made a ton of removals, and now it filters properly and seems to better protect non-technical users. I know telemetry can be necessary to gather feedback for developers, but it should not happen if there's any chance (even accidentally) of sensitive data like strings, tokens, images, embeddings, or training dataset information being sent without the user knowing.

It seems OP from reddit misread a line of code about prompt_trackingconsent for comfy-cli, but I still don't think their post should have been removed as there are legitimate security concerns. But I know I'll definitely be renaming all of my datasets with an in front of everything now before training just in case ;)

Huggingface recently changed TRANSFORMERS_OFFLINE=1, so I duplicate it a bunch of times in activate.bat (in comfyui/venv/scripts/, for example) because I don't know exactly where to put it. But anyway, here is the opt-out code that could be added if you like:

REM Disable Huggingface Telemetry
set "HF_HUB_DISABLE_TELEMETRY=1"

REM Set Huggingface Transformers to Offline mode 
set "TRANSFORMERS_OFFLINE=1"

Not sure if this next one would cause any issues, but from what I read it was recently added, and I believe it now lets from_pretrained use local_files_only=True universally.

REM Set Huggingface Offline mode
set "HF_HUB_OFFLINE=1"
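
For setups that launch from Python instead of webui.bat, the same opt-out could be sketched roughly like this. It is only a sketch, not code from Forge: the helper name apply_hf_privacy_env is hypothetical, and it assumes the function runs before huggingface_hub or transformers are imported, since values set after import may not be picked up.

```python
import os

# The three variables from the batch snippets above; "1" switches each one on.
HF_PRIVACY_ENV = {
    "HF_HUB_DISABLE_TELEMETRY": "1",  # opt out of huggingface_hub telemetry
    "TRANSFORMERS_OFFLINE": "1",      # transformers: local files only
    "HF_HUB_OFFLINE": "1",            # huggingface_hub: no calls to the Hub
}


def apply_hf_privacy_env(env=os.environ):
    """Set each variable unless a value is already present (hypothetical helper)."""
    for name, value in HF_PRIVACY_ENV.items():
        # setdefault keeps whatever the user already exported.
        env.setdefault(name, value)
    return env
```

Calling apply_hf_privacy_env() at the very top of a launch script would mirror the set lines above. Using setdefault rather than plain assignment is a design choice here, so that someone who deliberately sets HF_HUB_OFFLINE=0 to download a model is not overridden.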

It would be great if we could completely uninstall huggingface-hub without it breaking sdforge like it does with comfyui, while still working completely with local models in offline mode. sd1_clip.py seems to be a problem in comfyui when attempting this, along with a few other areas.

Another alternative... maybe any upload request going outside of localhost must be approved in the console by pressing y? We can do that with a firewall, but this would be for the sake of convenience. I tried to implement this in aiohttp for comfyui for some POST and requests stuff I didn't like, but it broke many things of course.
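
To make the "press y to approve" idea concrete, here is a rough sketch of just the decision logic. It does not hook into aiohttp, requests, or Forge; the names is_local_host and approve_outbound are hypothetical, and a DNS name other than "localhost" is simply assumed to be remote.

```python
import ipaddress
from urllib.parse import urlsplit


def is_local_host(host):
    """Return True for 'localhost' or a loopback IP literal (127.0.0.1, ::1)."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (a DNS name or empty string): treat as remote.
        return False


def approve_outbound(url, ask=input):
    """Allow local URLs silently; ask in the console before anything else."""
    host = urlsplit(url).hostname or ""
    if is_local_host(host):
        return True
    answer = ask(f"Allow outbound request to {host}? [y/N] ")
    # Only an explicit "y" approves; pressing Enter denies.
    return answer.strip().lower() == "y"
```

The ask parameter defaults to input so the prompt appears in the console, but it can be swapped out for testing. A real implementation would still need to wrap every place the app opens a connection, which is the part that tends to break things.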

rktvr commented 2 months ago

having it on by default and no way to opt out is awful. anyone know what exactly is being transmitted? i assume prompts/images generated (or info thereof)?

311-code commented 2 months ago

The comfy-cli telemetry was on by default, but not everyone uses that repo, and if you were lucky it gave a prompt during install and you could say no. I said no when it asked me in the past.

For Huggingface, they give the option to disable the telemetry by setting the environment variables above. The first link explains pretty well what it appears to send. I also agree the huggingface code could use some better filtering for potentially sensitive data. I did find it interesting that they have trained albert models on imdb. Kind of want to download it and see what's actually in it.

I think the huggingface people maybe don't fully know what some people are adding to their code sometimes.

It could be that someone decides to add a bunch of "telemetry improvement" features in a random PR, and it looks like an upgrade but can sacrifice user privacy and go unnoticed (depending on how confusing it looks). I've always found it sort of odd when someone does that.

lunar-studio commented 2 months ago

Agreed. But could someone track down their IPs and block them via firewall? I didn’t even realize they had done this.

311-code commented 2 months ago

Yes, you can get the outgoing IP addresses that background apps are targeting on your network using the WFN repo. It will prompt with a popup when something is trying to send outbound or inbound, and it has detailed logging. The 2.6 version seems to work for Windows 11, but overall it's a very buggy app.

After getting it going, though, you can set blocks in Windows Firewall under advanced settings by right-clicking inbound or outbound and creating a rule. Then go to custom, all programs, choose the protocol or select any, then specify the IP and port to block. netstat -ano in cmd prompt is also useful for seeing the process IDs and current connections. Make sure to back up the current firewall rules first before blocking random services or deleting random rules.

Overall I'd recommend Linux Mint though if you care about telemetry and want something that sort of feels like Windows (plus the VRAM benefits). Then set up similar firewall restrictions with the incoming and outgoing connections prompting you.

There are so many areas of Windows that expose user data that it is very difficult to manage, and a lot of it is tied to pretty essential services in services.msc. For example, if you try to block certain ports in svchost.exe from connecting to the internet, Windows Firewall gives warnings about "windows-service hardening rules" overriding it. That likely means Windows is still going to allow outbound or inbound traffic for svchost.exe for whatever is wrapped in the service protocol/port you just blocked (that Microsoft deems essential...). gpedit.msc can mitigate a lot of it, but it is not natively available for home edition and it's all still very time consuming.

Edit: forgot to mention, these are also useful commands in PowerShell or cmd as admin:

netstat -ano | findstr :(enter questionable port)

tasklist /FI "PID eq <PID>" (replace <PID> with the suspect app's PID shown by netstat -ano to show its associated .exe)

taskkill /PID <PID> /F to kill the task; make sure to also block it or its ports.
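
If you would rather script the netstat lookup than eyeball it, something like the following could work. It is a sketch under assumptions: the function name connections_on_port is hypothetical, the parser expects the English-locale column layout of netstat -ano, and the sample rows (addresses and PIDs) are made up for illustration.

```python
def connections_on_port(netstat_text, port):
    """Return (proto, local, foreign, pid) rows of `netstat -ano` text using `port`."""
    rows = []
    for line in netstat_text.splitlines():
        parts = line.split()
        # TCP rows have a State column (5 fields); UDP rows do not (4 fields).
        if len(parts) == 5 and parts[0] == "TCP":
            proto, local, foreign, _state, pid = parts
        elif len(parts) == 4 and parts[0] == "UDP":
            proto, local, foreign, pid = parts
        else:
            continue  # header, blank, or unrelated line
        # The port is whatever follows the last colon (also covers [::]:135).
        ports = {local.rsplit(":", 1)[-1], foreign.rsplit(":", 1)[-1]}
        if str(port) in ports:
            rows.append((proto, local, foreign, int(pid)))
    return rows


# Made-up sample in the shape netstat -ano prints on Windows.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       1234
  TCP    192.168.1.5:50123      203.0.113.10:443       ESTABLISHED     4321
  UDP    0.0.0.0:5353           *:*                                    2222
"""

for row in connections_on_port(SAMPLE, 443):
    print(row)
```

On Windows the text could come from subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout. Unlike findstr :443, which also matches ports such as :4430, this compares the whole port number, and the PID it returns is the one to pass to tasklist or taskkill.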