FChannel0 / FChannel-Server

GNU Affero General Public License v3.0

Tor image proxy #38

Open · the-sageman opened this issue 2 years ago

the-sageman commented 2 years ago

One of the more annoying aspects of the website is the inability to view images from Tor instances. A server-side proxy is easily implementable, but even with the right safeguards (checking that it only accesses valid domains), it is still easily abusable, since it acts as a free proxy. A server-side proxy would benefit both JavaScript and non-JavaScript users. For JavaScript users only, there is also the possibility of a client-side Tor proxy, but the library responsible for that looks like it will remain a work in progress for at least a few more years.

FChannel0 commented 2 years ago

If the server is running a Tor proxy that allows it to make HTTP requests, wouldn't the client-side equivalent just be the client running a Tor proxy (similar to Tor Browser) directly on their machine? Speaking for the server side, I don't think it's wise to cache media from Tor instances. If a client wants a "seamless" experience when running into Tor instances, they have to be using a Tor proxy themselves, unless there is a way to serve the Tor media through a proxy to the client that makes it viewable on the clearnet. That, again, puts more responsibility on the server owner than one might like. Having the client use a Tor proxy leaves the clearnet server free from hosting the physical media, and by choosing to run a Tor proxy the client is opting in to see that content, instead of the server deciding for them. Maybe I am misunderstanding your point, though.

ghost commented 2 years ago

The idea of fetching arbitrary content from Tor (even if the server is known, the users are not and never can be, due to the nature of Tor and imageboards) and caching it on my server to serve to the clearnet is just short of terrifying to me as a sysadmin. I do not want to be legally responsible for that. I think the true client-side solution here is a browser that is capable of onion routing but doesn't default to it, which is far outside the scope of FChannel. I don't think this is a solvable problem at the level of FChannel without putting administrators at legal risk or dumping loads of JS into the client. Look at that thread from kyogi that currently has a gif of Terry Davis at the bottom: I don't know what the images in that thread are, but I sure as shit know that I don't want them on a server I'm legally responsible for.

0jsc commented 2 years ago

> A server side proxy is easily implementable, but even with the right safeguards (checking that it only accesses valid domains), it is still easily abusable as it acts as a free proxy.

What about checking the Referer header? That prevents hotlinking. It's also possible to add something like a CSRF token to make the links single-use. For example, if each link carries an HMAC over the URL and the current timestamp, links can be limited in validity to something like 30 seconds. If all links have to be signed by the server in that way, the proxy can't be used to view material that the server didn't specifically link to.

For example, if the URL looked something like this:

https://imgproxy.123chan.biz/proxy.php?url=asdasd.onion%2Fimg123.jpg&timestamp=1626266096&hash=c987fe818ab69ebf2e6b1bb0b6cce002af62bf9c9b5c7050b947e4d338052052

hash=$(printf '%s' 'https://imgproxy.123chan.biz/proxy.php?url=asdasd.onion%2Fimg123.jpg&timestamp=1626266096&hash=' | openssl dgst -sha256 -hmac "secret")

(Note: `printf '%s'` rather than `echo`, since `echo` appends a trailing newline that would change the digest.)

The server would then check the timestamp and the signature.

As for exposing the server's IP as the proxy's exit, what about just forcing all outgoing traffic, even clearnet, through Tor? That way, Tor exit nodes are the ones doing the proxying.

Under US and European law, transit providers do not have legal liability. Though it could still be gross.

If this isn't satisfactory, here's a somewhat overengineered mitigation I came up with:

You have several servers involved: the onion service (S1) agrees to serve the content in an encrypted format, the frontend server (S2) participates in the fediverse and holds all the metadata, and a dumb middlebox/proxy (S3) just takes an encrypted fetch instruction and returns a result. So:

  1. S2 obtains an image URL from S1
  2. S2 generates a random key
  3. S2 encrypts the random key and the image URL with another random key, then encrypts this second key to S1's public key, then HMACs the whole thing with a shared secret (e.g. Diffie-Hellman between their public keys; this is a solved problem)
  4. S2 presents to the client (ciphertext, hmac, encrypted_key, inner_key)
  5. The client sends to S3 a blob of (ciphertext, hmac, encrypted_key)
  6. S3 sends this blob to S1
  7. S1 checks the HMAC; if OK, it responds with an encrypted and HMAC'd blob (using the random key from step 2 as the encryption key). If not OK, it either sends an error or random data
  8. S3 blindly forwards this blob to the client, 100% oblivious to its contents
  9. The client checks the MAC to ensure S3 didn't tamper with the blob, then decrypts it using JS

This way, the proxy has zero understanding of what's going on. This is basically re-inventing HTTPS, except with a third party involved in the key exchange. It would also be possible to make all of these keys single-use by putting a time limit on the HMAC and storing every HMAC consumed within that limit in a temporary database (HMACs older than N seconds are always invalid, so the database never grows large).