Closed jdkoren closed 1 year ago
Thank you for reporting this - the local development server started by the appengineRun task should not be communicating with prod data, and it is odd that a local datastore file seems to be logged but not used or found. Sorry to hear about the frustrations encountered there!
To help us troubleshoot and better understand the behavior discrepancy, could you share more details of the setup under which this was encountered? In particular:
Are you using an app.yaml or appengine-web.xml based project, and what is the Java runtime and App Engine environment?
The app is using appengine-web.xml, java8 runtime, and (I think) the standard environment.
How is the run configuration set up for the project?
We currently do not have any run configuration, we only configure appengine.deploy.
Is the project doing any custom configuration of the local datastore location?
AFAIK we do not have any custom configuration of the local datastore location.
@jdkoren Have you tried following the directions here: https://cloud.google.com/datastore/docs/tools/datastore-emulator#automatically_setting_the_variables?
Can you clarify which version of the Maven plugin you were using, and which version of the Gradle one you're using? They should be equivalent in behavior.
cc/ @ludoch
This is really weird... There is no need for an extra Cloud Datastore emulator (that one is used for non-GAE local apps). The local GAE Dev AppServer boots the local dev appserver and all the GAE API emulators at the same time, in the same JVM. It seems you are also saying it works with Maven and does not work with Gradle? Can you share the App Engine plugin settings from the pom and build files for both?
Again there is no need for a Java8 GAE app to ever use gcloud beta emulators datastore, unless you are not using the com.google.appengine.api.datastore API classes of course.
@meltsufin @ludoch
Can you clarify which version of the Maven plugin you were using, and which version of the Gradle one you're using?
Previously we were using appengine-maven-plugin v2.4.1
<plugin>
  <groupId>com.google.cloud.tools</groupId>
  <artifactId>appengine-maven-plugin</artifactId>
  <version>2.4.1</version>
</plugin>
We are now using appengine-gradle-plugin v2.4.3
buildscript {
  dependencies {
    classpath 'com.google.cloud.tools:appengine-gradle-plugin:2.4.3'
  }
}
apply plugin: 'com.google.cloud.tools.appengine'
Have you tried following the directions here: https://cloud.google.com/datastore/docs/tools/datastore-emulator#automatically_setting_the_variables
Yes, I did follow the directions to set the environment variables. It might also be noteworthy that I needed to set an additional environment variable, otherwise I got an error when trying to start the test server:
export DATASTORE_USE_PROJECT_ID_AS_APP_ID=true
According to this doc
Note: We are migrating the local development environment to use the Cloud Datastore Emulator. For more information about this change, see the migration guide.
and this doc linked there (which I am not 100% sure if this applies to Java as well),
Cloud Datastore Emulator is progressively being rolled out as the default Datastore implementation for dev_appserver.
The Cloud Datastore Emulator is the default emulator for a portion of dev_appserver users. If you are using the Cloud Datastore Emulator, dev_appserver will display:
... Using Cloud Datastore Emulator.
(emphasis added by me)
So I think the first thing to check is whether the dev appserver is launching its own legacy local emulator or the Cloud Datastore emulator (which should run in a separate process, it seems), as well as if you are manually running your own emulator on top of it. Sounds like you are also launching your own emulator outside the dev appserver, so I guess the next would be to figure out if your app is connecting to the emulator run by the dev appserver or by you?
@chanseokoh My original comment was about using appengineRun alone; it was only after running into this issue that I found the page about the Datastore Emulator. Without starting an emulator (and without setting any of the associated environment variables), I see the following in the logs:
INFO: Local Datastore initialized:
Type: High Replication
Storage: {project path}/build/exploded-samplesindex/WEB-INF/appengine-generated/local_db.bin
That file does not exist, and one of the endpoint URLs immediately returns data that is in our live Cloud Datastore. I don't know how to check whether the dev appserver is launching an emulator or whether my app is connecting to it.
@jdkoren Is there any chance you can provide a minimal sample project that reproduces the issue, preferably with both a Maven and a Gradle build file? It's really surprising to see this kind of difference with the two plugins that were written to be equivalent.
@jdkoren Thanks for the pointers to the project code! Looking through it, I wonder if the use of Objectify and this Objectify issue is related here. The way the project initializes ObjectifyFactory() looks like what’s described in this comment, and perhaps this is bypassing what's being launched by the local dev app server.
That said, it’s still puzzling why this only surfaced after the move from maven to gradle - were there other significant changes (perhaps related to Objectify or environment variables) that were made along with this move?
One subtle difference I noticed in the gradle plugin (compared to maven) is that run.projectId defaults to deploy.projectId's value when not explicitly configured (source).
It doesn’t explain the main behavior of [wrong datastore being used], but may have played a part in allowing the communication - you can try explicitly setting this to something different in the run configuration. Changing this to align with maven could be a fix we want to make here, for safeguarding against scenarios like this.
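As a minimal sketch of the suggestion above, the run block in build.gradle can be given its own project ID so that it no longer inherits the one used for deployment. Both ID values below are placeholders invented for illustration, and nothing in this thread establishes that this alone stops the app from reaching live Datastore:

```groovy
appengine {
    run {
        // Placeholder ID (an assumption for illustration), deliberately
        // different from the production project so the local server does
        // not fall back to deploy.projectId.
        projectId = 'local-dev-only'
    }
    deploy {
        // Hypothetical production project ID.
        projectId = 'my-production-project'
        version = 'GCLOUD_CONFIG'
    }
}
```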
@emmileaf Looking back through recent changes, we did update Objectify from version 5.0.3 to 6.0.9. I'm currently trying to follow the instructions here to see if I can make Objectify connect to a datastore emulator.
This looks promising. Thanks for the update!
From @jdkoren:
After some experimentation I can confirm that the ClassNotFoundException that I'm hitting is a problem that occurs when I connect Objectify to the local datastore emulator. It happens regardless of whether I use the maven plugin or the gradle plugin. If I don't run the datastore emulator, Objectify just connects to live Datastore (again regardless of which plugin).
Since the issue is not in the plugins, I'm closing the issue, but feel free to continue the discussion.
According to this doc
This doc is for the Python dev_appserver (a Python program), not for Java, which does not use the external datastore emulator.
@ludoch It seems you are right. I found the doc for java, which states the following:
The development web server simulates Datastore using a local file-backed Datastore on your computer. The Datastore is named local_db.bin, and it is created in your application's WAR directory, in the WEB-INF/appengine-generated/ directory. It is not uploaded with your application.
I see no mention of running a datastore emulator separately at all on this page.
My team has a Java web app using Appengine and Cloud Endpoints, and it interacts with Cloud Datastore via the Objectify library. We recently migrated our build system from Maven to Gradle.
Problem
We discovered that by default a local test server started with the appengineRun task interacts with live production data in Cloud Datastore. This is markedly different from the appengine:run Maven goal, which always used a local datastore, and that was the behavior we were expecting. (Apparently we need to use a datastore emulator to get that behavior.)
As a result, my local test server with unfinished and experimental code modified our live production data and caused considerable disruption. This is startling and undesirable for a default behavior, and makes it easy for developers to shoot themselves in the foot.
Incidentally, the following message is printed in the local server log at startup and would mislead someone to believe that the test server uses a local datastore:
I spent an inordinate amount of time trying to find that file or any file like it on my filesystem, but none existed, which further corroborated that the test server was communicating with our project's live Cloud Datastore.
Proposed solutions
(These are not mutually exclusive and may not be completely formed ideas.)
I. Remove misleading log message
Don't print a message saying that a local datastore was initialized if it wasn't.
II. Don't allow writes without using emulation
Allow the current behavior of communicating with live Cloud Datastore data, but only allow reads.
III. Require explicit allowance when not using emulation
Make the developer explicitly allow the test server to interact with live Cloud Datastore. Imagine a configurable property like this:
When appengineRun is invoked, check if there's a Datastore emulator. If there isn't, and the above property is not set or is set to false, exit with an error and print a message explaining how to resolve.
IV. Use managed emulation
What if the Gradle plugin could start and stop datastore emulation automatically? This way the developer can just invoke appengineRun, and can avoid accidents that result from forgetting to start the emulator themself. Imagine a configuration block like this, where all the properties could have reasonable defaults (and therefore be omitted):
These would simply be passed as arguments to an invocation of gcloud beta emulators datastore start behind the scenes.
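Until the plugin offers something like proposal III, a project could approximate that check in its own build script. The following is only a sketch under stated assumptions: allowLiveDatastore is a hypothetical project property (not a plugin option), and the check relies on DATASTORE_EMULATOR_HOST, the variable exported by the emulator's env-init step, being visible to the Gradle process. It does not verify that an emulator is actually listening.

```groovy
// build.gradle (sketch). 'allowLiveDatastore' is a hypothetical project
// property made up for this example; the plugin does not define it.
tasks.matching { it.name == 'appengineRun' }.all { task ->
    task.doFirst {
        // Set by the Datastore emulator's env-init output when exported.
        def emulatorHost = System.getenv('DATASTORE_EMULATOR_HOST')
        // Opt-in escape hatch, passed as -PallowLiveDatastore=true.
        def allowLive = project.findProperty('allowLiveDatastore') == 'true'
        if (!emulatorHost && !allowLive) {
            throw new GradleException(
                'DATASTORE_EMULATOR_HOST is not set, so the app may talk to ' +
                'live Cloud Datastore. Start the emulator and export its ' +
                'variables, or pass -PallowLiveDatastore=true.')
        }
    }
}
```

With this in place, running appengineRun without the emulator variables fails fast, while appengineRun -PallowLiveDatastore=true keeps the current behavior for developers who explicitly want it.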