HTTPArchive / legacy.httparchive.org

<<THIS REPOSITORY IS DEPRECATED>> The HTTP Archive provides information about website performance such as # of HTTP requests, use of gzip, and amount of JavaScript. This information is recorded over time revealing trends in how the Internet is performing. Built using Open Source software, the code and data are available to everyone allowing researchers large and small to work from a common base.
https://legacy.httparchive.org

Create custom metric: robots.js #262

Closed: jroakes closed this 2 years ago

jroakes commented 2 years ago

This custom metric parses the raw HTML, the rendered DOM, the response headers, and iframes for valid robots user-agent names and directive values.

Example Output:

"robots": {
    "mainFrameRobotsRendered": {
        "robots": {
            "noindex": false,
            "index": true,
            "follow": true,
            "none": false,
            "nofollow": false,
            "noarchive": false,
            "nosnippet": false,
            "unavailable_after": false,
            "max-snippet": true,
            "max-image-preview": true,
            "max-video-preview": true,
            "notranslate": false,
            "noimageindex": false,
            "nocache": false,
            "indexifembedded": false
        }
    },
    "mainFrameRobotsRaw": {
        "robots": {
            "noindex": false,
            "index": true,
            "follow": true,
            "none": false,
            "nofollow": false,
            "noarchive": false,
            "nosnippet": false,
            "unavailable_after": false,
            "max-snippet": true,
            "max-image-preview": true,
            "max-video-preview": true,
            "notranslate": false,
            "noimageindex": false,
            "nocache": false,
            "indexifembedded": false
        }
    },
    "mainFrameRobotsHeaders": [],
    "iFrameRobotsRaw": [],
    "iFrameRobotsHeaders": []
},

This has NOT been thoroughly tested, but I am posting it here because time is limited.
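For readers unfamiliar with the output shape, here is a minimal sketch of how one robots `content` string could be turned into the boolean map shown above. This is not the code from robots.js: `parseDirectives` and `DIRECTIVES` are hypothetical names, and only the directive list itself is taken from the example output.

```javascript
// Directive names copied from the example output above.
const DIRECTIVES = [
  "noindex", "index", "follow", "none", "nofollow", "noarchive", "nosnippet",
  "unavailable_after", "max-snippet", "max-image-preview", "max-video-preview",
  "notranslate", "noimageindex", "nocache", "indexifembedded",
];

// Hypothetical helper (an assumption, not the actual robots.js function):
// turn one robots `content` string into a directive -> boolean map.
function parseDirectives(content) {
  // Split on commas, then keep only the name before any ":value" suffix.
  const seen = new Set(
    content
      .split(",")
      .map((part) => part.split(":")[0].trim().toLowerCase())
  );
  // Unknown names are ignored; every known directive gets true or false.
  return Object.fromEntries(DIRECTIVES.map((name) => [name, seen.has(name)]));
}

const parsed = parseDirectives(
  "index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1"
);
console.log(parsed.index, parsed["max-snippet"], parsed.noindex);
// prints: true true false
```

Directives that carry a value, such as `max-snippet:-1`, are reported only as present or absent here, which matches the `true`/`false` values in the example output.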

jroakes commented 2 years ago

Another test. This one includes the iframe rollup.

"robots": {
    "mainFrameRobotsRendered": {
        "googlebot": {
            "noindex": false,
            "index": false,
            "follow": false,
            "none": false,
            "nofollow": false,
            "noarchive": false,
            "nosnippet": true,
            "unavailable_after": false,
            "max-snippet": false,
            "max-image-preview": false,
            "max-video-preview": false,
            "notranslate": false,
            "noimageindex": false,
            "nocache": false,
            "indexifembedded": false
        }
    },
    "mainFrameRobotsRaw": {
        "googlebot": {
            "noindex": false,
            "index": false,
            "follow": false,
            "none": false,
            "nofollow": false,
            "noarchive": false,
            "nosnippet": true,
            "unavailable_after": false,
            "max-snippet": false,
            "max-image-preview": false,
            "max-video-preview": false,
            "notranslate": false,
            "noimageindex": false,
            "nocache": false,
            "indexifembedded": false
        }
    },
    "mainFrameRobotsHeaders": [],
    "iFrameRobotsRaw": {
        "robots": {
            "noindex": 3,
            "index": 0,
            "follow": 0,
            "none": 0,
            "nofollow": 0,
            "noarchive": 0,
            "nosnippet": 0,
            "unavailable_after": 0,
            "max-snippet": 0,
            "max-image-preview": 0,
            "max-video-preview": 0,
            "notranslate": 0,
            "noimageindex": 0,
            "nocache": 0,
            "indexifembedded": 0
        }
    },
    "iFrameRobotsHeaders": []
},
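The counts under `iFrameRobotsRaw` look like per-directive totals across iframes. A rough sketch of such a rollup follows, assuming each iframe yields a map of user-agent name to directive booleans. The `rollup` helper and the `frames` sample are hypothetical illustrations, not taken from robots.js.

```javascript
// Hypothetical helper (an assumption about how the rollup could work):
// sum per-directive booleans across iframes, grouped by user-agent name.
function rollup(frames) {
  const totals = {};
  for (const frame of frames) {
    for (const [agent, directives] of Object.entries(frame)) {
      totals[agent] = totals[agent] || {};
      for (const [name, isSet] of Object.entries(directives)) {
        // Each iframe that sets the directive adds 1 to its count.
        totals[agent][name] = (totals[agent][name] || 0) + (isSet ? 1 : 0);
      }
    }
  }
  return totals;
}

// Made-up sample: three iframes that each declare noindex.
const frames = [
  { robots: { noindex: true, index: false } },
  { robots: { noindex: true, index: false } },
  { robots: { noindex: true, index: false } },
];
console.log(JSON.stringify(rollup(frames)));
// prints: {"robots":{"noindex":3,"index":0}}
```
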
jroakes commented 2 years ago

I am closing this and will reopen it on the custom metrics project.