arcjet / arcjet-js

Arcjet JS SDKs. Rate limiting, bot protection, email verification & attack defense for Node.js, Next.js, Bun & SvelteKit.
https://arcjet.com
Apache License 2.0
241 stars 5 forks source link

Improve Fly.io client IP detection with Node / Express #1754

Open davidmytton opened 5 days ago

davidmytton commented 5 days ago

On Fly.io, the client IP address in req.ip provided by Express.js is always an internal IP address e.g. ::ffff:172.16.21.10. This causes the request to be rejected by Arcjet when in production mode.

To pass the correct IP, we have to construct a custom request object e.g:

   const flyIp = req.get("Fly-Client-IP") || req.ip;

    // Construct an object with Arcjet request details
    const details = {
        ip: flyIp,
        method: req.method,
        host: req.hostname,
        url: req.path,
        headers: req.headers,
    };

    const decision = await aj.protect(details, {
        requested: 1
    });

The correct IP can then be detected by Arcjet. I couldn't use the https://github.com/arcjet/arcjet-js/tree/main/ip package because req.headers is not the expected format:

file:///Users/david/Code/rmx/node_modules/@arcjet/ip/index.js:501
    const xClientIP = headers.get("x-client-ip");
                              ^

TypeError: headers.get is not a function
    at findIP (file:///Users/david/Code/rmx/node_modules/@arcjet/ip/index.js:501:31)
    at file:///Users/david/Code/rmx/server.js:28:22
    at Layer.handle [as handle_request] (/Users/david/Code/rmx/node_modules/express/lib/router/layer.js:95:5)
    at trim_prefix (/Users/david/Code/rmx/node_modules/express/lib/router/index.js:328:13)
    at /Users/david/Code/rmx/node_modules/express/lib/router/index.js:286:9
    at Function.process_params (/Users/david/Code/rmx/node_modules/express/lib/router/index.js:346:12)
    at next (/Users/david/Code/rmx/node_modules/express/lib/router/index.js:280:10)
    at SendStream.error (/Users/david/Code/rmx/node_modules/serve-static/index.js:121:7)
    at SendStream.emit (node:events:519:28)
    at SendStream.error (/Users/david/Code/rmx/node_modules/send/index.js:270:17)

Node.js v20.17.0
blaine-arcjet commented 4 days ago

If the req.ip is not a global IP, more lookups are attempted, which would reach the fly platform check at https://github.com/arcjet/arcjet-js/blob/main/ip/index.ts#L634

blaine-arcjet commented 4 days ago

I couldn't use the https://github.com/arcjet/arcjet-js/tree/main/ip package because req.headers is not the expected format:

You need to build ArcjetHeaders with the @arcjet/headers package, e.g. new ArcjetHeaders(req.headers), and pass those to the @arcjet/ip package.

davidmytton commented 4 days ago

This case is unusual because req.ip is a bogon and there is no platform header. Fly-Client-IP is not set when the request is coming from a health check, so Arcjet errors because the IP is not a public IP.

There's an argument to say that you shouldn't be using Arcjet against health checks, but I think we'll see this with a lot of monitoring & deployment systems.

The current behavior seems correct to me - if we can't determine the public IP then we have to error (unless you can think of a way around it?). So the question is: what is our recommended workaround if you want to have Arcjet run on every endpoint (including the health check)? Bogon check + looking at the user agent?

blaine-arcjet commented 3 days ago

So the question is: what is our recommended workaround if you want to have Arcjet run on every endpoint (including the health check)?

Why does there need to be a workaround? If decision.isErrored() it is also decision.isAllowed(), so the health check will just be allowed. Since this seems that we wouldn't block the health check, is the only problem here that the user is seeing an error log?

Something else related to this: if we can't build characteristics, we should probably just avoid sending to the service at all—and the workaround for that would be to use a different characteristic.

davidmytton commented 3 days ago

Yeah the errors in the logs are disconcerting, and could be verbose if they're doing regular health checks from multiple locations. This is more about how we improve the DX around this scenario rather than it being a change to how our core functionality works i.e. what's our official recommendation that would go in the docs?

blaine-arcjet commented 2 days ago

I think:

  1. We need to write up some clear docs about "ERROR" decisions, such that you can't treat an error as a block and actually need to do manual validation in the code (e.g. look for no user-agent header) if you want to block errors.
  2. If we fail to build a fingerprint from characteristics, we should immediately return an error decision saying the fingerprint could not be built. We'll also document that you should always build your fingerprint from valid sources—e.g. if you have traffic from non-public IPs, then you should not use IP as a fingerprint.

Do you think anything else needs to be done?

davidmytton commented 2 days ago

Agreed on both. I've added https://github.com/arcjet/arcjet-docs/issues/96 for the docs.