thegreenwebfoundation / co2.js

An npm module for accessing the green web API, and estimating the carbon emissions from using digital services

Return verbose response for green hosting checks #189

Closed fershad closed 5 months ago

fershad commented 8 months ago

Is your feature request related to a problem? Please describe. Currently, running a greencheck using CO2.js returns either:

Describe the solution you'd like It would be nice to have the option to get back a more verbose response from the Greencheck API, including any public evidence that is available for a green provider.

Describe alternatives you've considered A developer could perform this fetch request themselves, but it would be nice to have it available within the library.

Additional context For example, checking "google.com" returns:

https://api.thegreenwebfoundation.org/api/v3/greencheck/google.com

{
  "url": "google.com",
  "hosted_by": "Google Inc.",
  "hosted_by_website": "https://www.google.com",
  "partner": null,
  "green": true,
  "hosted_by_id": 595,
  "modified": "2024-02-02T04:27:28",
  "supporting_documents": [
    {
      "id": 108,
      "title": "Sustainability at Google",
      "link": "https://sustainability.google"
    },
    {
      "id": 139,
      "title": "Independent verification of Google 2020 Reporting",
      "link": "https://s3.nl-ams.scw.cloud/tgwf-web-app-live/uploads/Google_Cloud_-_3degrees_cloud_services_review_statement_final.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCWT1WBAW6NZ5SW5GYJ8%2F20240202%2Fnl-ams%2Fs3%2Faws4_request&X-Amz-Date=20240202T054248Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=1976a86d90604c11260b17420839a2c92b1dcdfba49b867dc23436ece5585fbf"
    },
    {
      "id": 140,
      "title": "2021 Environmental Report",
      "link": "https://s3.nl-ams.scw.cloud/tgwf-web-app-live/uploads/google-2021-environmental-report.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCWT1WBAW6NZ5SW5GYJ8%2F20240202%2Fnl-ams%2Fs3%2Faws4_request&X-Amz-Date=20240202T054248Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=c97f46d056156a53201743603de3d863688f3c9aedf57e91ee29d94fe7171c9e"
    },
    {
      "id": 141,
      "title": "2022 Environmental Report",
      "link": "https://s3.nl-ams.scw.cloud/tgwf-web-app-live/uploads/google-2022-environmental-report.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCWT1WBAW6NZ5SW5GYJ8%2F20240202%2Fnl-ams%2Fs3%2Faws4_request&X-Amz-Date=20240202T054248Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=a7346f2bdd14f7a18ee87a9524c4bb5b4c8fbeb1f6847dbac2439fdab2e71185"
    },
    {
      "id": 142,
      "title": "Independent verification of Google 2021 Reporting",
      "link": "https://s3.nl-ams.scw.cloud/tgwf-web-app-live/uploads/alphabet-fy2021-environmental-indicators-assurance-letter.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCWT1WBAW6NZ5SW5GYJ8%2F20240202%2Fnl-ams%2Fs3%2Faws4_request&X-Amz-Date=20240202T054248Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=965f57864647a0313f2e0f75a6c0394b6e8441f61e962b2bebdce22ced830ee1"
    }
  ]
}
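For reference, the hand-written fetch mentioned under the alternatives might look roughly like the sketch below. The endpoint is the one from the example request above; `fetchVerboseGreencheck` and its `fetchImpl` parameter are hypothetical names used for illustration and are not part of CO2.js.

```javascript
// Base URL taken from the example request above.
const GREENCHECK_URL = "https://api.thegreenwebfoundation.org/api/v3/greencheck/";

// Hypothetical helper, not part of CO2.js. fetchImpl is injectable so the
// function can be exercised without network access.
async function fetchVerboseGreencheck(domain, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(GREENCHECK_URL + encodeURIComponent(domain));
  if (!response.ok) {
    throw new Error(`Greencheck request failed with status ${response.status}`);
  }
  // Return the full payload, including supporting_documents when present.
  return response.json();
}
```

Calling `fetchVerboseGreencheck("google.com")` would resolve to an object shaped like the JSON above, assuming the API responds as it did in that example.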
sfishel18 commented 8 months ago

@fershad i'd like to take this one if it's available. would this make sense as an approach?

mrchrisadams commented 8 months ago

hey @fershad - would you consider having this as part of an options object we pass into the existing check API?

const { hosting } = require("@tgwf/co2"); // or probably the ES6 import syntax, now that it's 2024

// define what we check. We pass in a single domain, or a list of domains
const domains = ["google.com", "facebook.com", "twitter.com"];

// set the options for how we make the request
const options = {
  userAgentIdentifier: "myExampleApplication 0.1",
  verbose: true,
};

const results = await hosting.check(domains, options);

It's pretty common to use an options object like this in js, and under the hood, you might still use checkVerbose to fetch the data.
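To make the idea concrete, here is a minimal single-domain sketch of how an options object could switch between the existing boolean result and the verbose payload. The option names `verbose` and `userAgentIdentifier` come from the example above; the `fetchImpl` option and the choice to send the identifier as a `User-Agent` header are assumptions for illustration, not how CO2.js is known to work internally.

```javascript
const GREENCHECK_URL = "https://api.thegreenwebfoundation.org/api/v3/greencheck/";

// Sketch of a check() that takes an options object. fetchImpl is an extra,
// hypothetical hook so the function can be tested without network access.
async function check(domain, options = {}) {
  const { verbose = false, userAgentIdentifier, fetchImpl = globalThis.fetch } = options;
  // Assumption: the identifier is sent as a User-Agent header.
  const headers = userAgentIdentifier ? { "User-Agent": userAgentIdentifier } : {};
  const response = await fetchImpl(GREENCHECK_URL + encodeURIComponent(domain), { headers });
  const payload = await response.json();
  // Verbose callers get the whole payload; everyone else gets the boolean.
  return verbose ? payload : payload.green;
}
```

With this shape, `check("google.com")` keeps returning a boolean, and `check("google.com", { verbose: true })` opts in to the richer response.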

Doing it this way would save us needing to have a second, similar method exposed to end users (we currently have one)

https://developers.thegreenwebfoundation.org/co2js/tutorials/check-hosting/

It would also introduce a way to handle further tweaks to the checks we make, if we need further options in future.

Note, this does imply us changing how we support user agent headers, which is a pain, but it would make future modifications to API requests sent from co2.js easier to manage.

Also, one thing about checking multiple domains:

for checking multiple domains, checkVerbose returns a dictionary of each domain mapped to its response payload

We currently don't support this server side, so this would likely need to make a bunch of separate requests in parallel to return the dictionary keyed by domain. Also, at present the multi check only returns 'green' results, so the only clue you have that a domain didn't show as green is its absence in the results. I actually prefer @sfishel18's approach, and the current multi check is on the older v2 endpoint - see /v2/greencheckmulti/{url_list}.

Would it be possible to break that part out into a different issue to discuss separately from the single domain verbose check being implemented?

Even if the first version is implemented client side in CO2.js, it feels like it would make more sense to support a verbose multidomain check on the server side, where you'd be able to fetch all the data in a single database query to send along in a payload, rather than lots of individual network requests.

Even if they're all done in parallel by the client, I suspect it would still likely be slower than a single query, and it would be a similarly good issue for contributors to pick up there too.
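For illustration, the client-side fallback described above (one request per domain, run in parallel, collected into a dictionary keyed by domain) could be sketched as follows. `checkManyVerbose` and the `fetchOne` callback are hypothetical names; `fetchOne` stands in for whatever function performs a single verbose check.

```javascript
// Hypothetical helper: runs one verbose check per domain in parallel and
// returns an object mapping each domain to its response payload.
async function checkManyVerbose(domains, fetchOne) {
  // Promise.all preserves input order, so payloads[i] belongs to domains[i].
  const payloads = await Promise.all(domains.map((domain) => fetchOne(domain)));
  return Object.fromEntries(domains.map((domain, i) => [domain, payloads[i]]));
}
```

Unlike the older multi check behaviour described above, non-green domains stay in the result, but the cost is one network round trip per domain.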

sfishel18 commented 8 months ago

for checking multiple domains, checkVerbose returns a dictionary of each domain mapped to its response payload

We currently don't support this server side, so this would likely need to make a bunch of separate requests in parallel to return the dictionary keyed by domain

@mrchrisadams could you give me a little background on this? the reason i proposed that response structure is that in the local testing i did, the API appears to already return things that way:

--> curl -X GET https://api.thegreenwebfoundation.org/v2/greencheckmulti/%5B%22google.com%22%2C%22example.com%22%5D | jq .

{
  "google.com": {
    "url": "google.com",
    "hosted_by": "Google Inc.",
    "hosted_by_website": "https://www.google.com",
    "partner": null,
    "green": true,
    "hosted_by_id": 595,
    "modified": "2024-02-09T02:55:42.344849",
    "supporting_documents": [
      {
        "id": 108,
        "title": "Sustainability at Google",
        "link": "https://sustainability.google"
      },
      {
        "id": 139,
        "title": "Independent verification of Google 2020 Reporting",
        "link": "https://s3.nl-ams.scw.cloud/tgwf-web-app-live/uploads/Google_Cloud_-_3degrees_cloud_services_review_statement_final.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCWT1WBAW6NZ5SW5GYJ8%2F20240209%2Fnl-ams%2Fs3%2Faws4_request&X-Amz-Date=20240209T025542Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=1942fcf1fa4778c0d3ec7255dcfe9b7550d173f49c858e4374e606016729f5e4"
      },
      {
        "id": 140,
        "title": "2021 Environmental Report",
        "link": "https://s3.nl-ams.scw.cloud/tgwf-web-app-live/uploads/google-2021-environmental-report.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCWT1WBAW6NZ5SW5GYJ8%2F20240209%2Fnl-ams%2Fs3%2Faws4_request&X-Amz-Date=20240209T025542Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=2641dd66faeccadc32176ab3d305c97dc916b42384745ba5485e16538ec097e8"
      },
      {
        "id": 141,
        "title": "2022 Environmental Report",
        "link": "https://s3.nl-ams.scw.cloud/tgwf-web-app-live/uploads/google-2022-environmental-report.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCWT1WBAW6NZ5SW5GYJ8%2F20240209%2Fnl-ams%2Fs3%2Faws4_request&X-Amz-Date=20240209T025542Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=a52b0146fdf84fa1646e471cae7726886cd5999b567d4c4bbf54ab5fc9f96944"
      },
      {
        "id": 142,
        "title": "Independent verification of Google 2021 Reporting",
        "link": "https://s3.nl-ams.scw.cloud/tgwf-web-app-live/uploads/alphabet-fy2021-environmental-indicators-assurance-letter.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=SCWT1WBAW6NZ5SW5GYJ8%2F20240209%2Fnl-ams%2Fs3%2Faws4_request&X-Amz-Date=20240209T025542Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=7eb09257dce0b6a7907f51216ed927d8f6bb8b77a2da36978cbf436c81284780"
      }
    ]
  },
  "example.com": {
    "url": "example.com",
    "hosted_by": null,
    "hosted_by_website": null,
    "partner": null,
    "green": false,
    "hosted_by_id": null,
    "modified": "2024-02-09T02:55:42.568323"
  }
}
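The percent-encoded path segment in the curl command above is a URL-encoded JSON array of domains. A small sketch of building that URL in JavaScript follows; `buildMultiCheckUrl` is a hypothetical helper name, and the endpoint is the one used in the curl call.

```javascript
const MULTI_CHECK_URL = "https://api.thegreenwebfoundation.org/v2/greencheckmulti/";

// Hypothetical helper: serialises the domain list as JSON, then
// percent-encodes it so it can be used as a single path segment.
function buildMultiCheckUrl(domains) {
  return MULTI_CHECK_URL + encodeURIComponent(JSON.stringify(domains));
}

console.log(buildMultiCheckUrl(["google.com", "example.com"]));
// prints https://api.thegreenwebfoundation.org/v2/greencheckmulti/%5B%22google.com%22%2C%22example.com%22%5D
```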
mrchrisadams commented 8 months ago

hi @sfishel18 - lol, please disregard what I said in the previous comment.

You're absolutely right about this API response returning richer information already from the backend - I was thinking about the old behaviour of the multi check API endpoint that would only return the green domain matches, not the extra information. I totally forgot I had added this support for extra supporting evidence in the backend around three years ago!

I agree that having this extra information exposed in the library would be really helpful - if the data is already being sent over the wire, I think it would definitely make sense to provide easier access to it from a method call in the library.

We'd also need to update the developer docs to make sure they reflect this new reality. I've created an issue to track this once this update has landed.

sfishel18 commented 7 months ago

@mrchrisadams think of it as a gift from your past self :)

so if i understand correctly the preferred approach would be to introduce an options argument to the existing check functions as opposed to adding a net-new checkVerbose for users to have to be aware of. do i have that right?

mrchrisadams commented 7 months ago

heh, @sfishel18 - I'll take it - thank you, past me! :D

I've had a bit of a look through what accepting an options parameter object would actually entail, and what alternatives exist now, and I'm now less sure about requesting the options parameter object, especially in light of ES6's default parameter features, which in many ways do the same job, natively. I think the options parameter object thing was a pattern used as a workaround before default parameters were widely available.
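To illustrate the trade-off, here are the two styles side by side. Both functions are hypothetical and simply echo back the settings they resolved, so the difference at the call site is easy to see; neither is the actual CO2.js signature.

```javascript
// Style A: positional parameters with ES6 default values.
function resolvePositional(domain, userAgentIdentifier = "", verbose = false) {
  return { domain, userAgentIdentifier, verbose };
}

// Style B: a single options object, destructured with defaults.
function resolveOptions(domain, { userAgentIdentifier = "", verbose = false } = {}) {
  return { domain, userAgentIdentifier, verbose };
}

// Asking for verbose output without setting a user agent:
const a = resolvePositional("google.com", undefined, true); // must skip a slot
const b = resolveOptions("google.com", { verbose: true }); // names only what it sets
```

Both calls resolve to the same settings. With positional defaults the caller has to pass `undefined` to skip a parameter, whereas the options object lets new settings be added later without changing the order of arguments.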

I'd also like to leave the decision with @fershad when it comes to making changes to the check() method signature, because we literally changed the check() method signature in the last release 😅

That said, I think we can decouple the decision to add support for verbose checks in the check() method via passing another parameter into the function, from having a useful checkVerbose method implemented in the library.

Based on @fershad's initial issue created, I'd say your initial proposal is sound:

  • add a new checkVerbose function to both hosting-node.js and hosting-api.js
  • for checking a single domain, checkVerbose returns the entire payload from the api.thegreenwebfoundation.org endpoint
  • for checking multiple domains, checkVerbose returns a dictionary of each domain mapped to its response payload

I'll create a separate issue about supporting a verbose flag/option in the check() method, as there's a bit more to discuss there, and I now realise it can be addressed independently of this issue if you were to add the checkVerbose() methods as you initially outlined.
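As a rough sketch of the behaviour in the bullet points above (not the implementation in the PR), a `checkVerbose` could branch on whether it receives one domain or a list. The two endpoints are the ones quoted earlier in this thread; the `fetchImpl` parameter is a hypothetical hook added for testing.

```javascript
const API_BASE = "https://api.thegreenwebfoundation.org";

// Sketch only: one domain returns the full payload from the single-domain
// endpoint; a list returns the dictionary keyed by domain from the multi endpoint.
async function checkVerbose(domainOrDomains, fetchImpl = globalThis.fetch) {
  const url = Array.isArray(domainOrDomains)
    ? `${API_BASE}/v2/greencheckmulti/${encodeURIComponent(JSON.stringify(domainOrDomains))}`
    : `${API_BASE}/api/v3/greencheck/${encodeURIComponent(domainOrDomains)}`;
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Greencheck request failed with status ${response.status}`);
  }
  return response.json();
}
```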

fershad commented 7 months ago

Guys, cheers for the chat about this. Sorry it's taken me so long to get around to this one.

@sfishel18 thank you for the PR. I'm commenting here so that my thoughts are captured in this issue before getting to a code review in the PR.

After reading the conversation above, and thinking about this one a bit more, I'm leaning towards @mrchrisadams's initial idea of having an options object passed into the check() function. This is because:

On that last point, the only tool I'm aware of that's picked it up is Sitespeed. By default, their tool uses a local database for greenchecks & users need to opt-in to checking against the API. That limits the potential impact of our change further.

We can also take this opportunity to update the hosting-node.js check() function so that it's the same as the hosting-api.js check() function. That would mean the path to the JSON database file is also passed into that check() function through the options object. This would mean our code is more consistent across both files.

@sfishel18 do you want me to start a code review in the PR, or do you want to go ahead and make changes to it? I can also help make code changes if you're busy with other things.

sfishel18 commented 7 months ago

@fershad all good points! it shouldn't be too much work to update my PR to work this way, and i have some time so i should be able to get to it in the next week or so. i'll put the PR into a draft state until it's updated.