buckket / twtxt

Decentralised, minimalist microblogging service for hackers.
http://twtxt.readthedocs.org/en/stable/
MIT License

"User-Agent:" shows up as part of the user agent in logs #147

Status: Open. Opened by adiabatic 5 years ago.

adiabatic commented 5 years ago

I see lines like this in my web server's log:

[redacted] - - [22/Sep/2019:19:15:01 +0000] "GET /twtxt.txt HTTP/2.0" 200 22082 "-" "User-Agent: twtxt/1.2.3 (+http://nblade.sdf.org/twtxt/twtxt.txt; @[redacted])"

twtxt is the only client that has User-Agent: in its User-Agent string.

Oddly enough, users who GET my twtxt file from IPv6 addresses don't have User-Agent: in their User-Agent strings. There are at least two users like this and both of them say they're using twtxt/1.2.3.

There are also users connecting from IPv4 addresses that have declined to say who they are on the twtxtiverse. Those User-Agent strings also (properly) lack the duplicated User-Agent:.
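
A minimal sketch of how observations like these could be tallied from an access log, assuming the server writes the combined log format seen in the line above. The names `user_agent`, `has_duplicated_prefix`, and `summarize` are hypothetical helpers for illustration and are not part of twtxt.

```python
import re
from collections import Counter

# Combined log format ends with: "request" status size "referrer" "user agent"
LOG_LINE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)


def user_agent(line):
    """Return the user agent field of one log line, or None if it does not parse."""
    match = LOG_LINE.search(line)
    return match.group("agent") if match else None


def has_duplicated_prefix(agent):
    """True when the header value itself starts with the header name."""
    return agent.lower().startswith("user-agent:")


def summarize(lines):
    """Count parsed lines by whether the agent carries the duplicated prefix."""
    counts = Counter()
    for line in lines:
        agent = user_agent(line)
        if agent is None:
            continue  # skip lines that are not in the assumed format
        counts["duplicated" if has_duplicated_prefix(agent) else "clean"] += 1
    return counts
```

Running `summarize(open("access.log"))` on a log file (the path is an assumption) would give a rough count of affected and unaffected requests, which could then be broken down further by client address.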

I had a look at the current generate_user_agent() and get_remote_tweets() but on a casual inspection I can't figure out what's causing this.

buckket commented 2 years ago

Strange, I just captured an outgoing HTTP request and the User-Agent header definitely does not contain the string "User-Agent".

Maybe your web server is adding this somehow? Or there may be other clients out there that identify themselves as "User-Agent: twtxt/1.2.3 ".
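
To illustrate the second possibility, here is a hypothetical sketch of how some client could end up sending such a value: the header name gets baked into the header value. This is not taken from twtxt's code; `build_headers_buggy`, `build_headers`, and `header_line` are made-up names for illustration only.

```python
def build_headers_buggy(agent):
    # Hypothetical mistake: the header name is repeated inside the value.
    return {"User-Agent": "User-Agent: " + agent}


def build_headers(agent):
    # The value should contain only the agent string itself.
    return {"User-Agent": agent}


def header_line(headers):
    """Render the User-Agent header roughly as it would appear in an HTTP/1.1 request."""
    return "User-Agent: " + headers["User-Agent"]
```

With the buggy variant, the request would carry `User-Agent: User-Agent: twtxt/1.2.3 ...`, and a server that logs only the header value would record a string beginning with `User-Agent:`, matching the log line in the report.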