arangodb / python-arango

The official ArangoDB Python driver.
https://docs.python-arango.com
MIT License

Requests with endpoint="/_api/simple/..." fail #270

Closed dpro-shc closed 1 year ago

dpro-shc commented 1 year ago

I noticed that calling some methods yields errors:

from arango import ArangoClient

# Initialize the ArangoDB client.
client = ArangoClient(hosts='http://arango.myhostname.com')

db = client.db(**creds)
raw_streams = db.collection('raw_streams')
raw_streams.name
raw_streams.all()

produces:

DocumentGetError                          Traceback (most recent call last)
Cell In[7], line 3
      1 raw_streams = db.collection('raw_streams')
      2 raw_streams.name
----> 3 raw_streams.all()

File ~\anaconda3\lib\site-packages\arango\collection.py:630, in Collection.all(self, skip, limit)
    627         raise DocumentGetError(resp, request)
    628     return Cursor(self._conn, resp.body)
--> 630 return self._execute(request, response_handler)

File ~\anaconda3\lib\site-packages\arango\api.py:74, in ApiGroup._execute(self, request, response_handler)
     63 def _execute(
     64     self, request: Request, response_handler: Callable[[Response], T]
     65 ) -> Result[T]:
     66     """Execute an API.
     67
     68     :param request: HTTP request.
   (...)
     72     :return: API execution result.
     73     """
---> 74     return self._executor.execute(request, response_handler)

File ~\anaconda3\lib\site-packages\arango\executor.py:65, in DefaultApiExecutor.execute(self, request, response_handler)
     56 """Execute an API request and return the result.
     57
     58 :param request: HTTP request.
   (...)
     62 :return: API execution result.
     63 """
     64 resp = self._conn.send_request(request)
---> 65 return response_handler(resp)

File ~\anaconda3\lib\site-packages\arango\collection.py:627, in Collection.all.<locals>.response_handler(resp)
    625 def response_handler(resp: Response) -> Cursor:
    626     if not resp.is_success:
--> 627         raise DocumentGetError(resp, request)
    628     return Cursor(self._conn, resp.body)

DocumentGetError: [HTTP 400][ERR 17] expecting string for

I have similar issues whenever I call a method that makes a request to "/_api/simple/". Am I doing something wrong? I'm using ArangoDB 3.11.2, python-arango v7.6, and Python 3.11.3.

apetenchea commented 1 year ago

Hello @dpro-shc,

Based on the error you provided, I have a few observations and questions:

I tried replicating the error on my end, but everything seems to work as expected. I'll post a working example below for your reference. Perhaps you can compare it with your code to see if there are any differences.

from arango import ArangoClient

# Connect to a local ArangoDB instance.
client = ArangoClient(hosts="http://127.0.0.1:8529")
db = client.db('_system', username='root', password='passwd', verify=True)

# Reuse the collection if it already exists, otherwise create it.
if db.has_collection('raw_streams'):
    raw_streams = db.collection('raw_streams')
else:
    raw_streams = db.create_collection('raw_streams')

# Insert one document, then read all documents back.
raw_streams.insert({"foo": "bar"})
print(list(raw_streams.all()))

dpro-shc commented 1 year ago

Hi @apetenchea,

My code matches your example and throws the insert error. I'm not connecting to localhost; maybe that could be part of the problem?

apetenchea commented 1 year ago

Document insertion uses _api/document. I would've expected the error to show something like _api/document/raw_streams/, but it does not. I suspect the collection name is not passed correctly in the request. Could you try to bypass the driver and see what a POST request returns? The driver itself does more or less the same: it just composes the request and sends it over to the coordinator. See the example in the documentation on how to insert a document using curl. For the sake of simplicity, please try it out using a database that uses neither credentials nor JWT.
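A minimal sketch of such a driver-free check, written in Python with only the standard library instead of curl. The host, database, and collection values are placeholders, `build_insert_request` and `send` are hypothetical helper names, and authentication is left out as suggested above.

```python
import json
import urllib.request


def build_insert_request(
    base_url: str, database: str, collection: str, document: dict
) -> urllib.request.Request:
    """Build (but do not send) a POST to /_db/<database>/_api/document/<collection>."""
    url = f"{base_url.rstrip('/')}/_db/{database}/_api/document/{collection}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send(request: urllib.request.Request) -> tuple:
    """Send the request and return (status, body); raises HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(request) as resp:
        return resp.status, resp.read().decode("utf-8")


# Usage (placeholder values, requires a reachable server):
#   req = build_insert_request("http://127.0.0.1:8529", "_system", "raw_streams", {"foo": "bar"})
#   print(req.get_method(), req.full_url)
#   print(send(req))
```

Printing the method and full URL before sending makes it easy to compare this request with what the driver composes.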

If you want to see exactly what the request sent by your driver looks like, you may add the following patch to arango/connection.py:

diff --git a/arango/connection.py b/arango/connection.py
index 49aa7b6..195a9eb 100644
--- a/arango/connection.py
+++ b/arango/connection.py
@@ -128,6 +128,7 @@ class BaseConnection:
         tries = 0
         indexes_to_filter: Set[int] = set()
         while tries < self._host_resolver.max_tries:
+            print(self._url_prefixes[host_index] + request.endpoint)
             try:
                 resp = self._http.send_request(
                     session=self._sessions[host_index],

To get the location of your package, run pip show python-arango. Don't forget to undo this change once you're done experimenting.
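If patching the installed package is not convenient, a rough alternative is to switch on Python's built-in HTTP debug output. This sketch assumes the driver's default HTTP client, which is built on requests and urllib3; `enable_http_debug` and `disable_http_debug` are hypothetical helper names, not part of python-arango.

```python
import http.client
import logging


def enable_http_debug() -> None:
    """Show outgoing requests without editing the driver's source."""
    # http.client prints the raw request line and headers to stdout.
    http.client.HTTPConnection.debuglevel = 1
    # urllib3 logs each request with its method, path, and status code.
    logging.basicConfig(level=logging.DEBUG)
    logging.getLogger("urllib3").setLevel(logging.DEBUG)


def disable_http_debug() -> None:
    """Restore quiet defaults once the experiment is over."""
    http.client.HTTPConnection.debuglevel = 0
    logging.getLogger("urllib3").setLevel(logging.WARNING)
```

Call `enable_http_debug()` before creating the client, run the failing call, then call `disable_http_debug()`.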

dpro-shc commented 1 year ago

I just tried bypassing the driver in a few ways: locally on my computer through swaggerHub, which worked flawlessly, and through an app running in a separate container, which didn't work.

The POST that failed yielded: "status":"rejected", "reason":{ "message":"404 - "{\"code\":404,\"error\":true,\"errorMessage\":\"expecting GET /_api/document/<collection>/<key>\",\"errorNum\":1203}"", "name":"Error",

using { headers: {…}, method: "POST", uri: "http://arango.myHost.com/_db/myDB/_api/document/raw_streams"

I also tried a GET on the collection from the container. It was successful and returned the expected collection properties from the URL: http://arango.myHost.com/_db/myDB/_api/collection/raw_streams

With the added code in the driver, the insert printed: http://arango.myHost.com/_db/myDB/_api/document/raw_streams, which is exactly what I used for testing in swaggerHub and worked successfully.

For raw_streams.all() I got: http://arango.myHost.com/_db/myDB/_api/simple/all

apetenchea commented 1 year ago

> With the added code in the driver, the insert printed: http://arango.myHost.com/_db/myDB/_api/document/raw_streams, which is exactly what I used for testing in swaggerHub and worked successfully.

The insert worked successfully from the python driver, or just from swaggerHub?

I changed the print statement in the driver code to print slightly more details from the request:

print(self._url_prefixes[host_index] + request.endpoint, request.data, request.method, request.params, request.headers)

The code I am using to test it is this:

from arango import ArangoClient
client = ArangoClient(hosts="http://127.0.0.1:8529")
db = client.db('_system', username='root', password='passwd', verify=True)
db.delete_collection('raw_streams')
if db.has_collection('raw_streams'):
  raw_streams = db.collection('raw_streams')
else:
  raw_streams = db.create_collection('raw_streams')
raw_streams.insert({"foo": "bar"})
print(list(raw_streams.all()))

I am getting the following output:

http://127.0.0.1:8529/_db/_system/_api/collection None get {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.8 ()'}
http://127.0.0.1:8529/_db/_system/_api/collection/raw_streams None delete {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.8 ()'}
http://127.0.0.1:8529/_db/_system/_api/collection None get {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.8 ()'}
http://127.0.0.1:8529/_db/_system/_api/collection {'name': 'raw_streams', 'waitForSync': False, 'isSystem': False, 'keyOptions': {'type': 'traditional', 'allowUserKeys': True}, 'type': 2} post {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.8 ()'}
http://127.0.0.1:8529/_db/_system/_api/document/raw_streams {'foo': 'bar'} post {'returnNew': '0', 'silent': '0', 'overwrite': '0', 'returnOld': '0'} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.8 ()'}
http://127.0.0.1:8529/_db/_system/_api/simple/all {'collection': 'raw_streams'} put {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.8 ()'}
[{'_key': '6024109', '_id': 'raw_streams/6024109', '_rev': '_gdnjSDW---', 'foo': 'bar'}]

Does it look similar if you try it?

dpro-shc commented 1 year ago

The insert did not work on the python driver, but it did on swagger. Also I tried using pyArango, which has been working flawlessly.

http://arango.myHost.com/_db/myDb/_api/simple/all {'collection': 'raw_streams'} put {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.3 ()'}

http://arango.myHost.com/_db/myDb/_api/document/raw_streams {'_key': 'foo'} post {'returnNew': '0', 'silent': '0', 'overwrite': '0', 'returnOld': '0'} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.3 ()'}

db.create_collection("test2") returns:

WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'RemoteDisconnected('Remote end closed connection without response')': /_db/dwh/_api/collection
http://arango.myHost.com/_db/myDb/_api/collection {'name': 'test2', 'waitForSync': False, 'isSystem': False, 'keyOptions': {'type': 'traditional', 'allowUserKeys': True}, 'type': 2} post {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.3 ()'}
http://arango.myHost.com/_db/myDb/_api/collection {'name': 'test2', 'waitForSync': False, 'isSystem': False, 'keyOptions': {'type': 'traditional', 'allowUserKeys': True}, 'type': 2} post {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.3 ()'}

I think this may not be an issue with python-arango and might be something I've got wrong in the DB config. Sorry for the confusion. Do you have any ideas?

apetenchea commented 1 year ago

The requests seem to be fine. I can suggest updating the driver, although that shouldn't make much of a difference. The warning generated by urllib3 looks like an indication of connection issues. Regarding raw_streams.all() and other calls that go through /_api/simple/, note that /_api/simple/ has been deprecated since ArangoDB 3.4. It should still work, but it's something we plan on replacing throughout the driver anyway. In the meantime, you can achieve the same using an AQL query:

print(list(db.aql.execute('FOR doc IN raw_streams RETURN doc')))

dpro-shc commented 1 year ago

Unfortunately, db.aql.execute("FOR doc IN raw_sources RETURN doc") prints this:

http://arango.myHost.com/_db/myDB/_api/cursor {'query': 'FOR doc IN raw_sources RETURN doc', 'count': False, 'memoryLimit': 0} post {} {'charset': 'utf-8', 'content-type': 'application/json', 'x-arango-driver': 'python-arango/7.5.3 ()'}

and raises the error: AQLQueryExecuteError: [HTTP 405][ERR 405] method not supported

If it is a connection error, would you have any idea what it might be?

apetenchea commented 1 year ago

It looks like the request has reached some server. Are you behind a proxy or something? I just don't get why it says "method not supported" when it is clearly a POST request that should work fine 🤨 Are you the one managing arango.myHost.com? If yes, are you running a cluster, or is it a single-server setup? I find it a bit weird that there's no port number. In a cluster setup, the requests should reach one of the coordinator servers (usually port 8529 or 8530). Have you set up any ports? If you send a request to an HTTP URL without specifying the port, the default port for the HTTP protocol, 80, is used.
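To make the port point concrete, here is a small sketch that reports which port a URL implies when none is written out. `effective_port` is a hypothetical helper, and the hostnames are just the placeholders used in this thread.

```python
from urllib.parse import urlsplit

# Well-known defaults when the URL carries no explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port if present, else the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"no default port known for scheme {parts.scheme!r}")
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://arango.myHost.com"))   # 80
print(effective_port("https://arango.myHost.com"))  # 443
print(effective_port("http://127.0.0.1:8529"))      # 8529
```

Comparing the result with the port your coordinator or reverse proxy actually listens on shows whether the hosts value passed to ArangoClient points at the right place.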

dpro-shc commented 1 year ago

my gosh, you're right. Sending requests to https fixed the problems 🤦 Thank you for the help and sorry for the confusion!