gee-community / qgis-earthengine-plugin

Integrates Google Earth Engine and QGIS using Python API
http://qgis-ee-plugin.appspot.com
MIT License
454 stars 116 forks source link

Refactoring the method to import EE lib and auth process, fixing bugs for the new GEE python API #142

Closed XavierCLL closed 1 month ago

XavierCLL commented 8 months ago

Main changes:

This also needs extra testing for us even if the people in the bug tested by themselves, because this changes the authentication method. These changes can also help with the new GEE requirements, because ee.Initialize() now is mandatory in the scripts/session inside the Qgis terminal, see more #143

At least one/two more testers, @gena and @SiggyF ?

Test Windows Linux Mac
GEE Authentication :white_check_mark: Xavier
tester 2
:white_check_mark: Xavier
tester 2
tester 1
tester 2
GEE Example :white_check_mark: Xavier
tester 2
:white_check_mark: Xavier
tester 2
tester 1
tester 2

For testing, use the Artifact at https://github.com/gee-community/qgis-earthengine-plugin/actions/runs/8452162046

XavierCLL commented 7 months ago

Hi @gena and @SiggyF I just added a commit e61e03df93b1cfd88d863e9fdbe9d23139b03c66 in this PR to fix the issue to restore a Qgis project with EE layers. For that:

It works with or without Google Cloud project. Old Qgis project files are not affected and we also allow the user to initialize EE in a specific Google Cloud project in their scripts (to be ready when Google decides only to use EE API through Google Cloud project)

gena commented 6 months ago

From the first look, this sounds like a nice improvement to store the project id in a QGIS project, but requiring users to call ee.Initialize() sounds like a breaking change and does not give me a good feeling (all 100k user scrips will break), I'd try to avoid that.

Let's keep the configuration of the plugin (with a specific project) decoupled from scripts. E.g., similar to how this works in the Code Editor. It sounds better to keep a specific Google Cloud project selection to be related to the plugin configuration and not to a specific script, otherwise we will have situations when the user opens a QGIS project with EE layers, it will render some layers with one project, and then when the user will run another EE script with ee.Initalize(project=...), it will run on another project - this will make debugging and usage monitoring a total mess.

Furthermore, sharing the project id with a QGIS project is bad from security reasons, you don't want to share your project ids with other users when sharing QGIS project files.

gena commented 6 months ago

Let me find a good workaround for this after consulting on how this EE authentication/initialization is supposed to work. We will pick it up with @SiggyF.

For the time being, users who encounter the circular dependency issue can use your binary with a workaround, these are just a few users AFAIK, probably they had this issue due to their machine / Python configuration. Does this sound like a good approach?

XavierCLL commented 6 months ago

From the first look, this sounds like a nice improvement to store the project id in a QGIS project, but requiring users to call ee.Initialize() sounds like a breaking change and does not give me a good feeling (all 100k user scrips will break), I'd try to avoid that.

IMO, we definitely need that ee.Initalize() must be defined always by the user in their script (like colab, geemap, or any GEE-api app need that variable) and mandatory if the user needs to initialize GEE in a g-cloud project (also recommended by the official documentation https://developers.google.com/earth-engine/guides/auth) and for the future changes.

If the user forgets to put ee.Initalize() in their script, a clear error popup saying that the user needs to run it:

image

Besides, any user familiar with GEE knows about that (it is a fundamental element in GEE), so I don't think It will break usability with this change at all. This PR includes changes in the documentation and examples to update the users (it will be good to make a special note in the changelog when we release the next version if we include this change).

Let's keep the configuration of the plugin (with a specific project) decoupled from scripts. E.g., similar to how this works in the Code Editor. It sounds better to keep a specific Google Cloud project selection to be related to the plugin configuration and not to a specific script, otherwise we will have situations when the user opens a QGIS project with EE layers, it will render some layers with one project, and then when the user will run another EE script with ee.Initalize(project=...), it will run on another project - this will make debugging and usage monitoring a total mess.

I don't think that is there a real problem with it. Have you tested, run and loaded different layers with different projects? Moreover this is something that the user needs to be aware of, this is more about the user side than the plugin side. The same situation could happen in colab, geemap, notebooks, and even in the official gee code

Furthermore, sharing the project id with a QGIS project is bad from security reasons, you don't want to share your project ids with other users when sharing QGIS project files.

Ok, that is true. This is a real issue that we need to reconsider. Another option but a more complicated way to do that, is when the project is loading, ask in a window for the project ID, but it is something that the user doesn't want to do every time a project is load. Another option is save it in a local user file.

Another option (the best if we can to do it) is to store the project ID in the credentials file using the equivalent of earthengine set_project my-project so every time the user call ee.Initalize(), init it in the project saved, and every time the user run ee.Initalize(project=project-id) it will update it. Read the Justin's reply that mentioned it here https://github.com/gee-community/qgis-earthengine-plugin/issues/143#issuecomment-2030227202

Last but not least, the original issue of circular dependency that this PR fix, happens in a good portion (I guess only new users, new installation) I'm sure that portion is at least 10 times more than people opening an issue here, so the sooner the better to release a new version.

gena commented 6 months ago

IMO, we definitely need that ee.Initalize() must be defined always by the user in their script (like colab, geemap, or any GEE-api app need that variable) and mandatory if the user needs to initialize GEE in a g-cloud project (also recommended by the official documentation https://developers.google.com/earth-engine/guides/auth) and for the future changes.

If the user forgets to put ee.Initalize() in their script, a clear error popup saying that the user needs to run it:

Besides, any user familiar with GEE knows about that (it is a fundamental element in GEE), so I don't think It will break usability with this change at all. This PR includes changes in the documentation and examples to update the users (it will be good to make a special note in the changelog when we release the next version if we include this change).

There is no doubt that EE needs to be initialized, it just feels too messy to have it in scripts. The QGIS EE plugin imitates Code Editor rather than Colab and users will get a better experience when they don't have to call it every time they write a script. This is also similar to how you work in Cloud Console (you select a project once there and use it everywhere). You can still always override it, it just does not have to be a mandatory option. It will also make scripts easier to share between users without the need to modify project ids every time, keeping the Google Cloud project selection and the script content separated.

Another option (the best if we can to do it) is to store the project ID in the credentials file using the equivalent of earthengine set_project my-project so every time the user call ee.Initalize(), init it in the project saved, and every time the user run ee.Initalize(project=project-id) it will update it. Read the Justin's reply that mentioned it here https://github.com/gee-community/qgis-earthengine-plugin/issues/143#issuecomment-2030227202 Yeah, we can of course also read it from credentials file. But this option feels flawed as well, as the user can also use Python EE as a standalone library, e.g. for CLI / Jupyter, and these will start to interfere with the QGIS EE plugin project selection.

What about storing it in QGIS settings? Something like https://docs.qgis.org/2.18/en/docs/pyqgis_developer_cookbook/settings.html maybe, and then providing a way for users to change it via some configuration UI, maybe also UI to Sign Out / In. Plus initialize it only once when the QGIS EE plugin starts. This would make it look similar to other plugins which require authentication / authorization (e.g. Mapboz, Felt, etc.).

I will consult with a few EE Python API developers as well to make sure we're not missing anything for the future user journeys here and will comment in this thread.

XavierCLL commented 3 months ago

Hi @gena, due to this #150 we need to discuss and resolve it in another PR.

The original idea for this PR was to fix the circular import bug, then I did a rollback of the commit that I tried to resolve partially the #150 in this PR, so now this PR resolve the circular import bug with other changes, and meanwhile we discuss and resolve #150 we need to provide these fixes together with PR #144 because many people are having these issues.

My proposal is to launch a new release with #142 and #144 PR asap, meanwhile we fix #150

In summary, this PR resolves the circular import bug and ee authentication process using the original ee functions (very recommended), however carry with the following changes:

  1. It needs users to add ee.Initalize() in their codes, (but it will be mandatory for #150)
  2. It can load EE objects saved in QGIS projects, but it cannot initialize ee in a Cloud Project (it needs to be fixed in #150), however it will work until the hard deadline
gena commented 2 months ago

The circular dependencies change is a good one, but forcing ee.Initialize() in all scripts is an overkill.

Can you please modify this PR to authenticate and initialize the EE library after the initialization of the plugin, in the init.py classFactory()?

In this case, when the plugin is installed for the first time - it will redirect user to the proper code to start the authentication process. Then we can modify that code to deal with the Cloud Project properly.

XavierCLL commented 1 month ago

@gena ok, done

gena commented 1 month ago

Great, thanks you! I will test it and will experiment with the central project configuration coming weekend to get #150 fixed as well.