LemmyNet / lemmy-ansible

A docker deploy for ansible
GNU Affero General Public License v3.0

[Bug]: Unable to fetch status on some instances (especially, Mastodon) #106

Closed perillamint closed 1 year ago

perillamint commented 1 year ago

Summary

Currently, Lemmy's nginx_internal.conf only forwards requests to lemmy if the Accept header exactly equals a single ActivityPub MIME type (or the request method is POST).

However, some ActivityPub implementations (discovered with Mastodon) send multiple MIME types in the Accept header. For example:

Accept: application/activity+json, application/ld+json; profile="https://www.w3.org/ns/activitystreams", text/html;q=0.1

This causes federation problems with Mastodon and other instances.
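To illustrate the failure mode described above, here is a minimal Python sketch (not Lemmy or nginx code). It compares an exact string comparison, like the one an nginx `if ($http_accept = "...")` performs, with a check that first splits the header into media types. The function names are hypothetical, and the comma split is deliberately naive: it assumes no commas appear inside quoted parameter values.

```python
# Accept header from the report (Mastodon-style, several media types).
MASTODON_ACCEPT = (
    'application/activity+json, '
    'application/ld+json; profile="https://www.w3.org/ns/activitystreams", '
    'text/html;q=0.1'
)


def exact_match(accept: str) -> bool:
    """Mimic an exact string comparison against one MIME type."""
    return accept == "application/activity+json"


def media_types(accept: str) -> list:
    """Split an Accept header into bare media types, dropping parameters.

    Naive: assumes no commas inside quoted parameter values.
    """
    return [
        part.split(";")[0].strip().lower()
        for part in accept.split(",")
        if part.strip()
    ]


def wants_activitypub(accept: str) -> bool:
    """True if any listed media type is an ActivityPub JSON type."""
    wanted = {"application/activity+json", "application/ld+json"}
    return any(t in wanted for t in media_types(accept))


if __name__ == "__main__":
    print(exact_match(MASTODON_ACCEPT))        # exact comparison misses it
    print(wants_activitypub(MASTODON_ACCEPT))  # per-type check finds it
```

The exact comparison only succeeds when the header contains that one type and nothing else, which is why a multi-type header falls through to the UI.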

Steps to Reproduce

  1. Deploy Lemmy instance
  2. Post random status on community
  3. Run curl -v -H 'Accept: application/activity+json, application/ld+json; profile="https://www.w3.org/ns/activitystreams", text/html;q=0.1' <your lemmy status URL here>

Expected behavior: Server returns proper activity+json payload from lemmy

Actual behavior: Server returns HTML response from lemmy-ui
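The curl step above can also be scripted. This is a rough Python sketch using only the standard library. The helper names, the `accept-check` User-Agent string, and the set of content types treated as "ActivityPub-like" are assumptions for illustration, not something Lemmy documents.

```python
import urllib.request

# Same Accept header as in the curl command above.
ACCEPT = (
    'application/activity+json, '
    'application/ld+json; profile="https://www.w3.org/ns/activitystreams", '
    'text/html;q=0.1'
)


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request carrying the multi-type Accept header."""
    return urllib.request.Request(
        url, headers={"Accept": ACCEPT, "User-Agent": "accept-check"}
    )


def looks_like_activitypub(content_type: str) -> bool:
    """Guess from a Content-Type value whether the backend answered.

    The accepted set below is an assumption for this sketch.
    """
    base = content_type.split(";")[0].strip().lower()
    return base in {
        "application/activity+json",
        "application/ld+json",
        "application/json",
    }


def check(url: str) -> bool:
    """Fetch the URL and report whether the response looks like JSON."""
    with urllib.request.urlopen(build_request(url)) as resp:
        return looks_like_activitypub(resp.headers.get("Content-Type", ""))
```

Calling `check("<your lemmy status URL here>")` on an affected instance would be expected to return `False`, since the response is the HTML page from lemmy-ui.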

Version

BE 0.18.0, commit 63d3759c481ff2d7594d391ae86e881e2aeca56d

Lemmy Instance URL

N/A

ticoombs commented 1 year ago

I cannot replicate this issue on 3 servers that I tested.

They all returned json data. I even tried swapping the accept headers to be application/ld+json, application/activity+json; and it still worked.

So not sure what error you are getting. Can you please double check that you are still getting the error on the initial server you were testing?

Edit: I tested on https://<lemmy.url>/.well-known/nodeinfo endpoint.

perillamint commented 1 year ago

@ticoombs

Can you test it with https://<lemmy.url>/post/<post number>? For me, it responds

$ curl -H 'Accept: application/activity+json, application/ld+json; profile="https://www.w3.org/ns/activitystreams", text/html;q=0.1' https://<my lemmy tunnel url>/post/6    

    <!DOCTYPE html>
    <html lang="en">
    <head>
    <script>window.isoData = {"

and if I modify Accept to application/activity+json

$ curl -H 'Accept: application/activity+json' https://<my lemmy tunnel url>/post/6     
{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    "https://w3id.org/security/v1",
    {
      "lemmy": "https://join-lemmy.org/ns#",
      "litepub": "http://litepub.social/ns#",
      "pt": "https://joinpeertube.org/ns#",

Additional note: https://<my lemmy tunnel url>/.well-known/nodeinfo returns a JSON payload regardless of the Accept header. It does so both with no Accept header and with Accept: text/html.
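One likely explanation for this note, assuming the instance uses a path-based backend block like the `location ~ ^/(api|pictrs|feeds|nodeinfo|.well-known)` quoted later in this thread: nginx prefers a matching regex location over the plain `location /` prefix, so these paths reach lemmy whatever the Accept header says. The Python sketch below only mirrors that path check; `routed_by_path` is a hypothetical helper name.

```python
import re

# Path pattern copied as-is from the backend location block in this thread.
BACKEND_PATHS = re.compile(r"^/(api|pictrs|feeds|nodeinfo|.well-known)")


def routed_by_path(path: str) -> bool:
    """True if the path alone sends the request to the backend."""
    return BACKEND_PATHS.search(path) is not None


if __name__ == "__main__":
    for path in ("/.well-known/nodeinfo", "/api/v3/site", "/post/6"):
        print(path, routed_by_path(path))
```

This is why testing against `/.well-known/nodeinfo` cannot reproduce the bug, while `/post/<post number>` can: only the latter depends on the Accept header.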

perillamint commented 1 year ago

After digging through join-lemmy.org, I found that some sites have the same problem.

Run the following commands to verify:

curl -H 'Accept: application/activity+json, application/ld+json; profile="https://www.w3.org/ns/activitystreams", text/html;q=0.1' https://reddthat.com/post/9701
curl -H 'Accept: application/activity+json' https://reddthat.com/post/9701

Also try the following URLs with the same Accept header combinations (picked at random from join-lemmy.org):

https://sh.itjust.works/post/213731
https://sopuli.xyz/post/603934

ticoombs commented 1 year ago

Ah! Yep I can confirm what you are seeing now. :+1:

ross-spencer commented 1 year ago

Thanks for discovering and posting this @perillamint. Seeing the same here on upgrade to 0.18.

jippi commented 1 year ago

I got hit by this one, first time user of lemmy :)

I fixed it by changing nginx.conf to

worker_processes auto;

events {
    worker_connections 1024;
}

http {
    map "$request_method:$http_accept" $the_upstream {
        # If no explicit matches exists below, send traffic to lemmy-ui
        default "http://lemmy-ui";

        # All non-GET/HEAD requests (e.g. POST) should go to lemmy
        "~^(?!GET|HEAD).*:.*" "http://lemmy";

        # GET/HEAD for ActivityPub JSON should go to lemmy
        "~^(GET|HEAD):.*?application\/activity\+json.*?" "http://lemmy";

        # GET/HEAD for Linked Data JSON should go to lemmy
        "~^(GET|HEAD):.*?application\/ld\+json.*?" "http://lemmy";
    }

    upstream lemmy {
        # this needs to map to the lemmy (server) docker service hostname
        server "lemmy:8536";
    }

    upstream lemmy-ui {
        # this needs to map to the lemmy-ui docker service hostname
        server "lemmy-ui:8537";
    }

    server {
        listen 8530;

        server_name   localhost;
        server_tokens off;

        gzip on;
        gzip_types text/css application/javascript image/svg+xml;
        gzip_vary on;

        # Upload limit, relevant for pictrs
        client_max_body_size 20M;

        add_header X-Frame-Options SAMEORIGIN;
        add_header X-Content-Type-Options nosniff;
        add_header X-XSS-Protection "1; mode=block";

        # frontend general requests
        location / {
            proxy_pass $the_upstream;

            rewrite ^(.+)/+$ $1 permanent;

            # Send actual client IP upstream
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }

        # backend
        location ~ ^/(api|pictrs|feeds|nodeinfo|.well-known) {
            proxy_pass "http://lemmy";

            # proxy common stuff
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";

            # Send actual client IP upstream
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
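
The `map` block above can be exercised outside nginx. The following Python sketch copies its three regex entries and applies them in order to the same `"$request_method:$http_accept"` key. It assumes Python's `re` behaves like nginx's PCRE for these particular patterns (negative lookahead and lazy quantifiers are supported in both). `pick_upstream` is a hypothetical name.

```python
import re

# Regex entries copied from the map block above, in the same order.
# nginx tries regex entries in order of appearance; the first match wins.
RULES = [
    (re.compile(r"^(?!GET|HEAD).*:.*"), "http://lemmy"),
    (re.compile(r"^(GET|HEAD):.*?application\/activity\+json.*?"), "http://lemmy"),
    (re.compile(r"^(GET|HEAD):.*?application\/ld\+json.*?"), "http://lemmy"),
]
DEFAULT = "http://lemmy-ui"

MASTODON_ACCEPT = (
    'application/activity+json, '
    'application/ld+json; profile="https://www.w3.org/ns/activitystreams", '
    'text/html;q=0.1'
)


def pick_upstream(method: str, accept: str) -> str:
    """Return the upstream the map would select for this request."""
    key = f"{method}:{accept}"
    for pattern, upstream in RULES:
        if pattern.search(key):
            return upstream
    return DEFAULT


if __name__ == "__main__":
    print(pick_upstream("GET", MASTODON_ACCEPT))
    print(pick_upstream("GET", "text/html"))
    print(pick_upstream("POST", "text/html"))
```

Because the patterns search for the MIME type anywhere in the header instead of comparing the whole string, the multi-type header from the original report is routed to lemmy.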

perillamint commented 1 year ago

@jippi

This looks much better than my hack. Can you create a new PR and reference this issue? I'll close my PR then.

ross-spencer commented 1 year ago

In the meantime, this is a diff between the current compose config and @jippi's change. I'm not sure what to do with the port changes, but the new routing config makes sense:

diff --git a/jippi b/jippi
index d5087b7..a7115e9 100644
--- a/jippi
+++ b/jippi
@@ -1,28 +1,43 @@
 worker_processes auto;
+
 events
 {
        worker_connections 1024;
 }
+
 http
 {
+       map "$request_method:$http_accept" $the_upstream
+       {
+               # If no explicit matches exists below, send traffic to lemmy-ui
+               default "http://lemmy-ui";
+
+               # All non-GET requests should go to lemmy
+               "~^(?!GET|HEAD).*:.*" "http://lemmy";
+
+               # GET/HEAD for ActivityPub JSON should go to lemmy
+               "~^(GET|HEAD):.*?application\/activity\+json.*?" "http://lemmy";
+
+               # GET/HEAD for Linked Data JSON should go to lemmy
+               "~^(GET|HEAD):.*?application\/ld\+json.*?" "http://lemmy";
+       }

        upstream lemmy
        {
                # this needs to map to the lemmy (server) docker service hostname
                server "lemmy:8536";
        }
+
        upstream lemmy-ui
        {
                # this needs to map to the lemmy-ui docker service hostname
-               server "lemmy-ui:1234";
+               server "lemmy-ui:8537";
        }

        server
        {
-               # this is the port inside docker, not the public one yet
-               listen 1236;
-               listen 8536;
-               # change if needed, this is facing the public web
+               listen 8530;
+
                server_name localhost;
                server_tokens off;

@@ -40,25 +55,10 @@ http
                # frontend general requests
                location /
                {
-                       # distinguish between ui requests and backend
-                       # don't change lemmy-ui or lemmy here, they refer to the upstream definitions on top
-                       set $proxpass "http://lemmy-ui";
-
-                       if ($http_accept = "application/activity+json")
-                       {
-                               set $proxpass "http://lemmy";
-                       }
-                       if ($http_accept = "application/ld+json; profile=\"https://www.w3.org/ns/activitystreams\"")
-                       {
-                               set $proxpass "http://lemmy";
-                       }
-                       if ($request_method = POST)
-                       {
-                               set $proxpass "http://lemmy";
-                       }
-                       proxy_pass $proxpass;
+                       proxy_pass $the_upstream;

                        rewrite ^(.+)/+$ $1 permanent;
+
                        # Send actual client IP upstream
                        proxy_set_header X-Real-IP $remote_addr;
                        proxy_set_header Host $host;
@@ -69,6 +69,7 @@ http
                location ~ ^/(api|pictrs|feeds|nodeinfo|.well-known)
                {
                        proxy_pass "http://lemmy";
+
                        # proxy common stuff
                        proxy_http_version 1.1;
                        proxy_set_header Upgrade $http_upgrade;
@@ -80,4 +81,4 @@ http
                        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                }
        }
-}
\ No newline at end of file

db0 commented 1 year ago

The internal nginx config from @jippi breaks my dev deployment. The main page works, but the communities tab is broken: https://overctrl.dbzer0.com/communities

Note that my frontend reverse proxy is HAProxy, not nginx. But it's a simple setup, and it works with the previous internal nginx config.

dessalines commented 1 year ago

@jippi Could you make a PR for those changes? thanks.

ross-spencer commented 1 year ago

@dessalines given what @db0 points out, do you know if there is a sitemap for Lemmy that would make it clearer how the regex should be designed and make it easier to test the different routes?

Nutomic commented 1 year ago

I don't think the problem is related to Accept headers, because the mentioned curl command works just fine with lemmy.ml, yet posts can't be fetched from Mastodon. Most likely it is related to a change in the code. So it would be good if an admin of a Mastodon or Hubzilla instance could check the server logs at the time of fetching a Lemmy object and see what kind of error is being logged.

ross-spencer commented 1 year ago

@Nutomic I can confirm it works with the instance you mention, but it doesn't work with, for example, the instance below, which is deployed using Ansible and which I believe is affected by this commit in this repo: https://github.com/LemmyNet/lemmy-ansible/commit/18c3f3a09f4d532dac7c74642077ff53bfccfa67

curl -v -H 'Accept: application/activity+json, application/ld+json; profile="https://www.w3.org/ns/activitystreams", text/html;q=0.1' https://digipres.cafe/post/27 -A "a-good-agent"

But if Lemmy.ml is deployed using Ansible then, indeed, it sounds helpful to get more diagnostics here.

zamuz commented 1 year ago

On 0.18.0, using @jippi's changes (excluding the port changes), I can subscribe to Kbin communities without issue. Before that, the status was always "Subscribe Pending".

academician commented 1 year ago

It seems like given commit 18c3f3a09f4d532dac7c74642077ff53bfccfa67, the fix should actually be in nginx_internal.conf?

I made a patch here that should do it, but I'm not making a PR because I haven't tested it: https://github.com/LemmyNet/lemmy-ansible/commit/973428810262bb6f310b808f5c62a71baf04bed1

zamuz commented 1 year ago

It seems like given commit 18c3f3a, the fix should actually be in nginx_internal.conf?

I made a patch here that should do it, but I'm not making a PR because I haven't tested it: 9734288

The fix is indeed for the nginx_internal.conf template. The comments above that mentioned nginx.conf were probably referring to the nginx.conf file inside the proxy container, as configured in the docker-compose.yml template.

jeena commented 1 year ago

I made a patch here that should do it, but I'm not making a PR because I haven't tested it: 9734288

It seems that map needs to be placed directly in the http block, not inside location / as in your patch. With the map inside location /, nginx throws an error and doesn't start. Just move it to where @jippi has shown it, and then it will work.

jippi commented 1 year ago

Hi! Sorry for the delay, been a busy week at $DayJob :)

I've created a PR that should make it work https://github.com/LemmyNet/lemmy-ansible/pull/114

Please give it a good test. I don't use this repository locally, so my config differs slightly in other ways. That shouldn't impact this, but it might.