Open romainsacchi opened 10 months ago
Hi Romain,
sorry for the late reply. It looks like ecoinvent switched to single sign-on authentication with Keycloak. This is not something eidl can currently support. Some of the code needs to be rewritten to make it work again.
The main problem is that I don't work with LCA/ecoinvent anymore, so I don't have ecoinvent credentials to test and develop this :-(
Does the page after logging in still look the same? If yes, then it's probably a smallish change to make it work again. If no, then eidl is most likely completely obsolete and needs to be replaced with something new.
Thanks Adrian. Unfortunately, I believe this also affects Activity Browser -- will let them know, maybe they can work on it.
I'll leave the issue open. If it's not fixed then eidl is pretty much useless.
@romainsacchi We are aware, thanks for the initiative though.
I was able to organize some credentials for testing. Did some quick scripting to check what changed on ecoinvent side:
UN=<username>
PW=<password>
TOKEN=$(curl -d "client_id=apollo-ui" -d "username=$UN" -d "password=$PW" -d "grant_type=password" https://sso.ecoinvent.org/realms/ecoinvent/protocol/openid-connect/token | jq -r .access_token)
curl -H "Authorization: Bearer $TOKEN" https://ecoquery.ecoinvent.org/3.9.1/cutoff/files
<!doctype html><html lang="en"><head><meta charset="utf-8"/><meta name="viewport" content="width=device-width,initial-scale=1"/><link rel="icon" href="/icons/favicon.ico"/><link rel="apple-touch-icon" sizes="180x180" href="/icons/apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="/icons/favicon-32x32.png"/><link rel="icon" type="image/png" sizes="16x16" href="/icons/favicon-16x16.png"/><link rel="manifest" href="/site.webmanifest"/><link rel="mask-icon" href="/icons/safari-pinned-tab.svg" color="#dd1414"/><meta name="msapplication-TileColor" content="#dd1414"/><link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap"/><link rel="manifest" href="/manifest.json"/><script defer="defer" src="/static/js/main.4268775c.js"></script><link href="/static/css/main.f04e5175.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div><div id="portal"></div></body></html>
curl -s -H "Authorization: Bearer $TOKEN" https://api.ecoquery.ecoinvent.org/web/versions | jq '.[0:3], {"total: ": length}'
[
{
"version": "3.9.1",
"release_date": "2022-10-01",
"system_model": "apos"
},
{
"version": "3.9.1",
"release_date": "2022-10-01",
"system_model": "cutoff"
},
{
"version": "3.9.1",
"release_date": "2022-10-01",
"system_model": "consequential"
}
]
{
"total: ": 38
}
The bad news:
The eidl approach of parsing the HTML of ecoquery.ecoinvent.org doesn't work anymore, everything has been replaced by JavaScript.
The good news:
I haven't found any public documentation for the API, but I'll play around with it a bit and see what I can do. Would be much cleaner anyway to get the required info via API instead of the very brittle HTML parsing "hack" used before.
We can directly ask ecoinvent to provide us with the API documentation if you want. Should I?
Sure, asking can't hurt. Otherwise it should be possible to "reverse engineer" the necessary API calls by using the dev tools in the browser, but I haven't gone so deep yet.
So, I should have communicated this already. Sorry about that.
There is currently no library to consume from the ecoinvent API, and no official documentation for the API endpoints. You should not expect there to be either of these anytime soon.
The SSO uses JSON Web Tokens, and normally logging in requires some shared secret (at least as far as I understand it). I haven't been able to get something working, though this stuff is far outside my comfort level.
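For anyone who wants to look inside such a token while debugging, here is a minimal sketch that decodes the payload of a JSON Web Token without verifying the signature. It assumes the standard three-part `header.payload.signature` layout; the helper name `decode_jwt_payload` is made up for illustration, and which claims ecoinvent's tokens actually carry is not confirmed anywhere in this thread.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT. For inspection only: the signature is NOT verified."""
    # A JWT consists of three base64url segments: header.payload.signature
    _header_b64, payload_b64, _signature = token.split(".")
    # base64url encoding drops the "=" padding, so restore it before decoding
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

If the token follows the usual conventions, the `exp` claim in the returned dict would give the expiry time as a Unix timestamp.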
You can watch the API calls to build a map of the routes, and probably can figure out what needs to be sent to get the results you want. I think the only difficult thing here is authentication. One additional hiccup could be getting the file downloads, as these URLs are generated per session and link click.
For the time being I think you need to tell people to log in and download the file manually and then automate the import. It's not great that the EIDL library is being killed off by actions of others but there isn't much I can do about this, and certainly not right now. Sorry.
Thanks for the feedback @cmutel. To be fair, I'm very surprised that eidl even worked for as long as it has now. I've been expecting a breaking change for a long time :smile:
I'll see if I can figure something out with the API.
This is a minimal working example with curl and jq to authenticate and download the ecoinvent 3.9.1_cutoff_ecoSpold02.7z file with the new API:
UN=yourusername
PW=yourpassword
TOKEN=$(curl -s -d "client_id=apollo-ui" -d "username=$UN" -d "password=$PW" -d "grant_type=password" https://sso.ecoinvent.org/realms/ecoinvent/protocol/openid-connect/token | jq -r .access_token)
curl -s https://api.ecoquery.ecoinvent.org/files -H "Authorization: Bearer $TOKEN" > files.json
UUID=$(jq -r '.[] | select(.version_name == "3.9.1") | .releases[] | select(.system_model_name == "Allocation cut-off by classification") | .release_files[] | select(.name == "ecoinvent 3.9.1_cutoff_ecoSpold02.7z") | .uuid' files.json)
download_url=$(curl -s "https://api.ecoquery.ecoinvent.org/files/r/$UUID" -H "Authorization: Bearer $TOKEN" | jq -r .download_url)
curl "$download_url" -o ecoinvent_3.9.1_cutoff_ecoSpold02.7z
So it's totally doable, I "only" need to translate these steps to python.
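A rough Python translation of the curl/jq steps above could look like the sketch below. It uses only the standard library and the URLs, form fields and JSON keys that appear in the shell example. The function names are made up for illustration, this is not the code that ended up in eidl, and the API is undocumented, so the response layout may change.

```python
import json
import urllib.parse
import urllib.request

SSO_URL = "https://sso.ecoinvent.org/realms/ecoinvent/protocol/openid-connect/token"
API_URL = "https://api.ecoquery.ecoinvent.org"


def get_token(username: str, password: str) -> str:
    """Password-grant login, mirroring the first curl call."""
    form = urllib.parse.urlencode({
        "client_id": "apollo-ui",
        "username": username,
        "password": password,
        "grant_type": "password",
    }).encode()
    # Passing data= makes urlopen send a form-encoded POST, like `curl -d`
    with urllib.request.urlopen(SSO_URL, data=form) as resp:
        return json.load(resp)["access_token"]


def api_get(path: str, token: str):
    """GET an API path with the bearer token and parse the JSON body."""
    req = urllib.request.Request(
        API_URL + path, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def find_file_uuid(files, version, system_model, filename):
    """Same selection as the jq filter (version -> release -> file); first match wins."""
    for entry in files:
        if entry["version_name"] != version:
            continue
        for release in entry["releases"]:
            if release["system_model_name"] != system_model:
                continue
            for release_file in release["release_files"]:
                if release_file["name"] == filename:
                    return release_file["uuid"]
    return None


def download(username, password, version, system_model, filename, target):
    """Run all steps: log in, list files, resolve the download URL, fetch the file."""
    token = get_token(username, password)
    uuid = find_file_uuid(api_get("/files", token), version, system_model, filename)
    if uuid is None:
        raise LookupError(f"{filename!r} not found in /files")
    url = api_get(f"/files/r/{uuid}", token)["download_url"]
    urllib.request.urlretrieve(url, target)
```

Calling `download(UN, PW, "3.9.1", "Allocation cut-off by classification", "ecoinvent 3.9.1_cutoff_ecoSpold02.7z", "ecoinvent_3.9.1_cutoff_ecoSpold02.7z")` would correspond to the shell session above.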
@haasad Wow, amazing! And I had no idea that you could use jquery on the command line, that sort of blows my mind.
See follow-up comment below.
~With this new architecture I think one should rewrite EIDL completely to make it more complete.~ We have already forked it here: https://github.com/brightway-lca/ecoinvent_interface, but I don't really care where the repo is, as long as one can expect it is maintained. That was the original reason for creating a fork.
~Here are the user stories that a new version would address:~
~As a user, I want to be able to find the integer id of a process given its filename (combination of UUIDs), so that I can perform follow-up operations on that activity~
~As a user, I want to be able to find the integer id of a process given its activity, product, location, and unit, so that I can perform follow-up operations on that activity~
~As an auditor, I want to be able to get the PDF report on a process, so that I can audit LCIs built on top of ecoinvent~
~As a programmer, I want to be able to get a complete ecoinvent release, so that I can install the database locally~
~As a user, I want to be able to get the ecospold XML for a single process, so that I can modify or install it myself~
~As a tool developer, I want to be able to get LCIA scores for one or more processes, so that I can build quick and simple calculators based on ecoinvent~
~These are real user stories, and some client library needs to support them. @haasad do you think that EIDL could be adapted for this broader functionality, or should we create something new on our own?~
I don't think it makes sense to do a broader refactor now, as the ecoinvent publication API is apparently expected to change a lot in the future.
I have now released eidl 2.0.0, which works with the new ecoinvent website. It now uses a bearer token in the HTTP request header for authentication and uses the API to find available files instead of parsing the HTML of the page as it used to. Tokens are refreshed automatically before they're used. Available version/system model combinations are still parsed from the filenames like previously, as I haven't found a good way to do this with the API.
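To illustrate the "refresh before use" idea, here is a minimal sketch of a token holder. It is not eidl's actual implementation. It assumes the token endpoint returns `expires_in` and `refresh_token` fields and accepts `grant_type=refresh_token`, as is usual for OAuth2 password-grant responses; the class name `TokenStore` and its parameters are made up. The HTTP call is passed in as `fetch` so the logic can be exercised without network access.

```python
import time
from typing import Callable, Optional


class TokenStore:
    """Hand out an Authorization header, renewing the token shortly before it expires."""

    def __init__(self, fetch: Callable[[dict], dict], login_form: dict,
                 clock: Callable[[], float] = time.time, margin: float = 30.0):
        self._fetch = fetch            # posts a form to the token endpoint, returns the JSON dict
        self._login_form = login_form  # client_id, username, password, grant_type=password
        self._clock = clock
        self._margin = margin          # renew this many seconds before expiry
        self._access: Optional[str] = None
        self._refresh: Optional[str] = None
        self._expires_at = 0.0

    def _store(self, response: dict) -> None:
        self._access = response["access_token"]
        self._refresh = response.get("refresh_token")  # assumed field name
        self._expires_at = self._clock() + float(response.get("expires_in", 0))

    def header(self) -> dict:
        if self._access is None:
            # First use: full login with username and password
            self._store(self._fetch(self._login_form))
        elif self._clock() >= self._expires_at - self._margin:
            if self._refresh:
                form = {"client_id": self._login_form["client_id"],
                        "grant_type": "refresh_token",
                        "refresh_token": self._refresh}
            else:
                form = self._login_form  # no refresh token available: log in again
            self._store(self._fetch(form))
        return {"Authorization": f"Bearer {self._access}"}
```

Refreshing only near expiry is one possible design; simply refreshing on every call would also match the description above, at the cost of an extra request each time.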
It should be available soonish on conda-forge (PR is merged), it's already available on the bsteubing channel.
@romainsacchi @marc-vdm I've tested it stand-alone and in the activity-browser and everything seems to work. But I'd be grateful if you could test it additionally and let me know if you encounter any issues.
One thing I haven't figured out yet is how it works/breaks for users with restricted ecoinvent licenses. (see #28 and https://github.com/LCA-ActivityBrowser/activity-browser/issues/775). This info was previously available as an HTML tag.
@cmutel:
We have already forked it here: https://github.com/brightway-lca/ecoinvent_interface, but I don't really care where the repo is, as long as one can expect it is maintained. That was the original reason for creating a fork.
I'm happy to continue maintaining eidl in its current scope (I've never really stopped). It's pretty essential for the activity-browser to work if users want a GUI only experience. Besides the download from the ecoinvent page, the cross-platform 7zip extraction was a big inconvenience before eidl.
But eidl's scope is pretty limited, I think you'd be better off with a dedicated API client library with proper documentation for the type of user stories you mentioned above.
One thing I haven't figured out yet is how it works/breaks for users with restricted ecoinvent licenses. (see https://github.com/haasad/EcoInventDownLoader/pull/28 and https://github.com/LCA-ActivityBrowser/activity-browser/issues/775). This info was previously available as an HTML tag.
The /files endpoint currently lists only files that are accessible so there is no need for additional filtering.
That's what I was hoping for and why I didn't use the publicly available /web/versions endpoint, thank you for confirming :+1:
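Since the version/system model combinations are derived from the file names returned by /files rather than from /web/versions, the parsing step might look like the sketch below. The pattern is inferred from the single file name quoted in this thread (`ecoinvent 3.9.1_cutoff_ecoSpold02.7z`), so treat it as an assumption; other naming schemes simply return `None`. The function name is hypothetical and this is not eidl's real code.

```python
import re
from typing import Optional, Tuple

# Assumed naming scheme: "ecoinvent <version>_<system model>_ecoSpold02.7z"
_RELEASE_FILE = re.compile(
    r"^ecoinvent (?P<version>\d+(?:\.\d+)*)_(?P<system_model>[^_]+)_ecoSpold02\.7z$"
)


def parse_release_filename(name: str) -> Optional[Tuple[str, str]]:
    """Return (version, system_model) for an ecoSpold02 archive name, else None."""
    match = _RELEASE_FILE.match(name)
    if match is None:
        return None
    return match["version"], match["system_model"]
```

Because /files only lists files the account can access, the set of tuples collected this way would already reflect the user's license.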
@haasad IIRC https://github.com/brightway-lca/ecoinvent_interface main difference at this point is the persistence in login credentials [1]
class Settings(BaseSettings):
    username: Optional[str]
    password: Optional[SecretStr]
@jsvgoncalves I wasn't aware of the fork before this discussion. I'll gladly accept a pull request if this would be useful for you. At first glance it looks like most of the other additional features (pdf download, logged_in decorator etc) are also broken with the new website.
@cmutel @cedric-roussel @jsvgoncalves I'm also totally open to discuss transferring this repo to the brightway-lca or ecoinvent orgs on github if you like. Or adding you as maintainers here if you feel like you can't depend on me reacting fast enough to issues. In the end this is just a tool I wrote more than 5 years ago, because it was useful for me at the time. I don't actively use it anymore. But I keep investing some effort into it from time to time, because it seems to be useful for others as well. Especially for the ActivityBrowser folks, it's pretty tightly integrated there. In my opinion it makes sense to keep this repo, because it's the source for the conda-forge packaging etc. (https://github.com/conda-forge/eidl-feedstock).
I like it, that was fast 👍
There is one pitfall currently with legal agreements: they need to be accepted on the website by all new and returning users. Unfortunately, there won't be a clear message provided by the API. Users who have never logged into the new website will receive an empty list, even with a valid license.
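Because the API gives no explicit signal in this case, a client could at least turn the empty list into a readable hint. A minimal sketch, assuming the parsed /files response is a plain list; the exception and function names are made up for illustration:

```python
class NoFilesError(RuntimeError):
    """Raised when /files returns nothing for an authenticated user."""


def require_files(files: list) -> list:
    """Return the file list unchanged, or raise with a hint if it is empty."""
    if not files:
        raise NoFilesError(
            "The /files endpoint returned an empty list. This can happen when the "
            "legal agreements have not yet been accepted on the ecoinvent website; "
            "log in there once and retry."
        )
    return files
```

An empty list could also have other causes, so the message is phrased as a possibility rather than a diagnosis.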
@haasad Can confirm this is now working after a regular update (conda update activity-browser).
Kind regards,
Marc van der Meide PhD Candidate
Leiden University | Faculty of Science - Institute of environmental sciences (CML)
Einsteinweg 2 | Leiden 2333 CC | LinkedIn: https://www.linkedin.com/in/marcvandermeide/?locale=en_US
@haasad Thanks very much for figuring out the tokens code. I have rewritten most of the ecoinvent_interface code using this new approach here: https://github.com/brightway-lca/ecoinvent_interface/tree/two-ooh. The code works, but testing and documentation are still very much TBD.
I chose to go a different direction for this library; see the differences with EIDL here: https://github.com/brightway-lca/ecoinvent_interface/tree/two-ooh#relationship-to-eidl
In the end I think it is fine to have two libraries, at least for now. By Brightcon the ecoinvent_interface needs to have the ability to get process documentation as well.
Could there be a login issue now that ecoinvent changed its web interface?
I struggle to log in even though I'm rather confident that I'm entering the correct user id/password. In fact, I can log in on the website... but not via eidl.