nexcess / magento-turpentine

A Varnish extension for Magento.
GNU General Public License v2.0

Question: Crawler IP Address #1043

Closed addison74 closed 8 years ago

addison74 commented 8 years ago

Here is my test configuration in a virtual machine:

Pound (IP_EXT, 80 and 443) => Varnish (127.0.0.1, 8090) => Apache (IP_EXT, 8080)

By default, Crawler IP Address is set to 127.0.0.1. In the above scenario, do I have to add the Pound or Apache IPs to the list too?

aricwatson commented 8 years ago

I believe you'd want to leave the field blank in this scenario, but certainly be sure to test! See this for more info.

addison74 commented 8 years ago

I will test different scenarios. For now I am using 127.0.0.1, IP_EXT. I will report whether visitors end up sharing the same session. If there is anything else I should watch for, please let me know.

Update: In the above scenario, the frontend cookie was set to crawler-session. This is not good. I will test your suggestion of leaving the field blank.

addison74 commented 8 years ago

@aricwatson Could you please explain why, when I add my computer's IP address to the "Crawler IP" list and then visit the web server, I do not get the crawler-session cookie? But if I set up "Crawler IP" value to webserver IP address I get crawler-session cookie. Is this a bug in Turpentine?

aricwatson commented 8 years ago

Sorry for the late reply, been busy as usual!

Here's the VCL that's actually generated from the Crawler IP and the Crawler User-Agents settings:

if (req.http.Cookie !~ "frontend=" && !req.http.X-Varnish-Esi-Method) {
    if (client.ip ~ crawler_acl || req.http.User-Agent ~ "^(?:ApacheBench/.*|.*Googlebot.*|JoeDog/.*Siege.*|magespeedtest\.com|Nexcessnet_Turpentine/.*)$") {
        set req.http.Cookie = "frontend=crawler-session";
    }
    ...
}

In other words, the VCL first checks that the request does not already include a frontend cookie and is not an ESI request. If both hold, it then checks whether the client IP is in the list of crawler IPs or the User-Agent matches the crawler User-Agent list, and in either case sets the cookie to frontend=crawler-session.

In your scenario, even though your IP is in the crawler list, all your requests will already include a frontend cookie if you're browsing the site normally, which is why you don't get the crawler-session cookie.
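
To make that decision logic easier to follow, here is a rough Python model of the VCL fragment above. It is only a sketch, not Turpentine code: the function name `crawler_cookie`, its arguments, and the sample address 203.0.113.7 are hypothetical, and it assumes `crawler_acl` simply contains the configured Crawler IP entries.

```python
import ipaddress
import re

# User-Agent pattern copied from the generated VCL quoted above.
CRAWLER_UA = re.compile(
    r"^(?:ApacheBench/.*|.*Googlebot.*|JoeDog/.*Siege.*"
    r"|magespeedtest\.com|Nexcessnet_Turpentine/.*)$"
)


def crawler_cookie(client_ip, cookie, user_agent, crawler_ips, is_esi=False):
    """Return the Cookie header as the VCL fragment would leave it.

    Hypothetical helper: crawler_ips stands in for crawler_acl, and
    is_esi stands in for the X-Varnish-Esi-Method request header.
    """
    # Outer condition: only requests without a frontend cookie that are
    # not ESI requests are considered at all.
    if "frontend=" in (cookie or "") or is_esi:
        return cookie

    # Inner condition: client IP in the ACL, or User-Agent matches.
    address = ipaddress.ip_address(client_ip)
    in_acl = any(
        address in ipaddress.ip_network(entry, strict=False)
        for entry in crawler_ips
    )
    if in_acl or CRAWLER_UA.search(user_agent or ""):
        return "frontend=crawler-session"
    return cookie
```

For example, `crawler_cookie("203.0.113.7", "frontend=abc123", "Mozilla/5.0", ["203.0.113.7"])` returns `"frontend=abc123"` unchanged, because the existing frontend cookie short-circuits the check. The same call with `cookie=None` returns `"frontend=crawler-session"`.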

"But if I set up "Crawler IP" value to webserver IP address I get crawler-session cookie. Is this a bug in Turpentine?"

I haven't been able to replicate this behavior.