googleapis / python-ndb

NDB: Query with UserProperty results in "An entity value is not allowed" error #1002

Open yihaoWang opened 1 month ago

yihaoWang commented 1 month ago

Environment details

  1. API: Google Cloud NDB
  2. OS type and version: macOS 14.5
  3. Python version: Python 3.10.9 (using pyenv)
  4. google-cloud-ndb version: 2.3.2 (using pip show google-cloud-ndb)

Steps to reproduce

  1. Create a model class with a UserProperty field, such as TestModel.
  2. Use the users.User object to create and store an instance of TestModel.
  3. Attempt to query the stored instance based on the UserProperty.
  4. Observe the error when trying to retrieve the result.

Code example

# junyi/activity/test_model.py
from google.cloud import ndb

class TestModel(ndb.Model):
    owner = ndb.UserProperty()

# junyi/activity/test_model_test.py
from google.appengine.api import users
from junyi.activity.test_model import TestModel
from testutil.gae_model import GAEModelTestCase

class TestTestModel(GAEModelTestCase):
    def test_get_user_data(self):
        user = users.User(email="test@test.com")
        test_model = TestModel(owner=user)
        test_model.put()
        query = TestModel.query().filter(TestModel.owner == user)
        result = query.get()
        print("result", result)

Stack trace

Traceback (most recent call last):
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_api.py", line 98, in rpc_call
    result = yield rpc
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "An entity value is not allowed"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"An entity value is not allowed", grpc_status:3, created_time:"2024-10-07T08:15:48.775244+08:00"}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/yihaowang/junyi/junyiacademy/junyi/activity/test_model_test.py", line 11, in test_get_user_data
    result = query.get()
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/query.py", line 1201, in wrapper
    return wrapped(self, *dummy_args, _options=query_options)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/utils.py", line 118, in wrapper
    return wrapped(*args, **new_kwargs)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/utils.py", line 150, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/query.py", line 2067, in get
    return self.get_async(_options=kwargs["_options"]).result()
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 210, in result
    self.check_success()
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 157, in check_success
    raise self._exception
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/query.py", line 2101, in get_async
    results = yield _datastore_query.fetch(options)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_query.py", line 116, in fetch
    while (yield results.has_next_async()):
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_query.py", line 343, in has_next_async
    yield self._next_batch()  # First time
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_query.py", line 373, in _next_batch
    response = yield _datastore_run_query(query)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_query.py", line 1030, in _datastore_run_query
    response = yield _datastore_api.make_call(
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_retry.py", line 97, in retry_wrapper
    raise error
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_retry.py", line 82, in retry_wrapper
    result = yield result
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_api.py", line 102, in rpc_call
    raise error
google.api_core.exceptions.InvalidArgument: 400 An entity value is not allowed

Thanks!

fbukevin commented 1 month ago

There seems to be some weird logic in your code. You declare a class named TestModel.

class TestModel(ndb.Model):
    owner = ndb.UserProperty()

Then you import a class with the same name from a custom module.

from junyi.activity.test_model import TestModel
...
        test_model = TestModel(owner=user)

Won't that cause a conflict when you invoke it?

yihaoWang commented 1 month ago

Apologies for the confusion. I've simplified the sample code to make it more straightforward and reproducible. Please check whether the following code works for you:

import os
import sys
from google.appengine.api import users
from google.cloud import ndb
import dev_appserver

class TestModel(ndb.Model):
    owner = ndb.UserProperty()

def init_ndb():
    os.environ["AUTH_DOMAIN"] = "example.com"

    ndb_client = ndb.Client(project="test")
    return ndb_client

def test_get_user_data():
    ndb_client = init_ndb()
    with ndb_client.context():
        user = users.User(email="test@test.com")
        test_model = TestModel(owner=user)
        test_model.put()
        query = TestModel.query().filter(TestModel.owner == user)
        result = query.get()
        print("result", result)

if __name__ == "__main__":
    test_get_user_data()

fbukevin commented 1 month ago

@yihaoWang

After removing import dev_appserver, which is only supported in the Python 2 client library, I can reproduce the same error.

Actually, the error "An entity value is not allowed" mostly occurs when you attempt to store types such as google.appengine.api.users.User through the Google Cloud NDB module. The value of this data type is not supported by Cloud NDB. The Cloud NDB library was reimplemented for Python 3, and it no longer supports some App Engine data types and modules.

The line ndb.UserProperty() uses the old App Engine API. It is not compatible with Python 3 and the Cloud NDB library.

One solution is to use a custom-defined property to store data such as the email address. You can create a user instance with google.appengine.api.users and store the email via user.email(), instead of storing the entire users.User object (i.e. owner = ndb.UserProperty()).

Here is an example based on an amended version of your code:

import os
from google.appengine.api import users
from google.cloud import ndb

class TestModel(ndb.Model):
    owner_email = ndb.StringProperty()

def init_ndb():
    os.environ["AUTH_DOMAIN"] = "example.com"
    ndb_client = ndb.Client(project="example-project")
    return ndb_client

def test_get_user_data():
    ndb_client = init_ndb()
    with ndb_client.context():
        user = users.User(email="test@test.com")
        test_model = TestModel(owner_email=user.email())
        test_model.put()
        query = TestModel.query().filter(TestModel.owner_email == user.email())
        result = query.get()
        print("result", result)

if __name__ == "__main__":
    test_get_user_data()

And the result of query is:

result TestModel(key=Key('TestModel', 5644004762845184), owner_email='test@test.com')

youchenlee commented 2 weeks ago

Actually, the error "An entity value is not allowed" mostly occurs when you attempt to store types such as google.appengine.api.users.User through the Google Cloud NDB module.

@fbukevin test_model.put() succeeds, but the error appears at:

result = query.get()

fbukevin commented 2 weeks ago

Sorry for the confusing wording. What I meant is that if you store a value of type google.appengine.api.users.User through the Google Cloud NDB module, a later attempt to query by that user property can lead to this error.

youchenlee commented 2 weeks ago

Thank you, @fbukevin .

We are migrating from google.appengine.ext.db to Cloud NDB. Many existing models and queries rely on UserProperty. Is there any workaround to make these queries compatible without needing to migrate billions of rows of data? 😢

youchenlee commented 2 weeks ago

https://github.com/googleapis/python-ndb/blob/c55ec62b5153787404488b046c4bf6ffa02fee64/google/cloud/ndb/model.py#L3240-L3243

According to the comment there, this issue can be resolved by adding meaning: 20 to the filter value in the final request:

project_id: "test"
partition_id {
  project_id: "test"
}
read_options {
}
query {
  kind {
    name: "TestModel"
  }
  filter {
    property_filter {
      property {
        name: "owner"
      }
      op: EQUAL
      value {
        entity_value {
          properties {
            key: "email"
            value {
              string_value: "test@test.com"
              exclude_from_indexes: true
            }
          }
          properties {
            key: "auth_domain"
            value {
              string_value: "example.com"
              exclude_from_indexes: true
            }
          }
        }
        meaning: 20 ### Added this ###
      }
    }
  }
  limit {
    value: 1
  }
}

This was done using a temporary hack in the code:

+++ /site-packages/google/cloud/datastore/helpers.py       2024-11-01 02:21:15.749998162 +0800
@@ -498,6 +498,8 @@
     elif attr == "entity_value":
         entity_pb = entity_to_protobuf(val)
         value_pb.entity_value.CopyFrom(entity_pb._pb)
+        if 'auth_domain' in val.keys() and 'email' in val.keys():
+            value_pb.meaning = 20
     elif attr == "array_value":
         if len(val) == 0:
             array_value = entity_pb2.ArrayValue(values=[])._pb

I'm looking forward to a better fix, in which the correct meaning is applied whenever a UserProperty or an App Engine User object is encountered.
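
To make the heuristic in the hack above easier to discuss, here is a minimal sketch that runs without any Google libraries. Plain dicts stand in for the Datastore Value message, and the names MEANING_USER, USER_KEYS and tag_user_meaning are hypothetical, not part of google-cloud-ndb or google-cloud-datastore. It only illustrates the key-based check; it is not a proposed patch.

```python
# Plain dicts stand in for the Datastore Value protobuf. This is an
# illustration of the heuristic only, not code from any Google library.

MEANING_USER = 20  # hypothetical name; the number comes from the request dump above
USER_KEYS = {"email", "auth_domain"}  # keys the hack checks for


def tag_user_meaning(value):
    """Return a copy of `value`, adding meaning=20 if it looks like a user entity."""
    tagged = dict(value)
    entity = value.get("entity_value")
    if entity is None:
        return tagged  # not an entity value, nothing to tag
    properties = entity.get("properties", {})
    if USER_KEYS.issubset(properties):
        tagged["meaning"] = MEANING_USER
    return tagged


user_filter_value = {
    "entity_value": {
        "properties": {
            "email": {"string_value": "test@test.com", "exclude_from_indexes": True},
            "auth_domain": {"string_value": "example.com", "exclude_from_indexes": True},
        }
    }
}
other_value = {"entity_value": {"properties": {"name": {"string_value": "x"}}}}

tagged_user = tag_user_meaning(user_filter_value)
tagged_other = tag_user_meaning(other_value)
print(tagged_user["meaning"])      # 20
print("meaning" in tagged_other)   # False
```

One caveat with this kind of check: any structured value that happens to contain both email and auth_domain keys would also be tagged, which is why a fix that knows the property type (for example inside UserProperty itself) would probably be safer than inspecting keys.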

fbukevin commented 2 weeks ago

Hi @youchenlee ,

You can meet your requirement this way as a workaround. I suggest opening a pull request with your fix. Either the Google engineering team had a reason to design it this way, or it is a bug and the fix can be accepted 🙂.

fbukevin commented 2 weeks ago

@googleapis

Steps to reproduce

# Imports as in the earlier examples in this thread
import os
from google.appengine.api import users
from google.cloud import ndb

class TestModel(ndb.Model):
    owner = ndb.UserProperty()

def init_ndb():
    os.environ["AUTH_DOMAIN"] = "ikala.ai"

    ndb_client = ndb.Client(project="cloud-sa-sandbox-1")
    return ndb_client

def test_get_user_data():
    ndb_client = init_ndb()
    with ndb_client.context():
        user = users.User(email="test@test.com")
        test_model = TestModel(owner=user)
        test_model.put()
        query = TestModel.query().filter(TestModel.owner == user)
        result = query.get()  # fails here with "An entity value is not allowed"
        print("result", result)

if __name__ == "__main__":
    test_get_user_data()

Demo

https://github.com/user-attachments/assets/aabdf2a1-1e29-4395-ab01-7cff9da630d2