davebshow / aiogremlin

http://aiogremlin.readthedocs.org/en/latest/

Connect to Azure CosmosDB graph flavour #13

Open adrpino opened 6 years ago

adrpino commented 6 years ago

Azure provides a graph "flavoured" version of their CosmosDB. There is a little quickstart here. They expose a gremlin endpoint, i.e. the websocket connection, some info here.

The problem I'm having with aiogremlin is that I need to provide a certfile and keyfile in aiogremlin/driver/server.py. Can I adapt the code to connect to that service?

davebshow commented 6 years ago

I'm not 100% sure that I follow. I haven't tested it, but you should be able to talk to the Azure endpoint with the aiogremlin.Client object. GLV stuff won't work because (afaik) they don't support bytecode yet.

It seems like you are having an issue with SSL. Is the problem that you want to connect to a secure websocket (wss) but don't have a certificate? Can you please show me your config, sample code, and the error you are getting?

adrpino commented 6 years ago

When you create a CosmosDB graph-flavoured database, you basically create a database, say test-database, and a graph, say test-graph.

The issue that I'm having is that the connection options provided in Azure are the following:

[Screenshot: Azure connection options]

It provides a key that you can use, for instance, with a TinkerPop Gremlin console using a remote-secure.yaml file such as:

hosts: [foo-test.graphs.azure.com]
port: 443
# In Azure the username is built as a URI to your graph
username: /dbs/test-database/colls/test-graph
password: "password string you see in the image above"
connectionPool: {
  enableSsl: true}
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { serializeResultToString: true }}

It works, printing this warning:

WARN  org.apache.tinkerpop.gremlin.driver.Cluster  - SSL configured without a trustCertChainFile and thus trusts all certificates without verification (not suitable for production)

In this way I can connect to the Azure DB. I don't know how gremlin console handles security of this connection and how it verifies the connection, but it runs for testing purposes.

What I don't know is how to pass options to Cluster(**options) so that they match the Azure settings in the remote-secure.yaml above.

It would be great if there were a very simple tutorial on how to use your library with CosmosDB, since it's by far the best one in Python. If I get it working with this service, I'm willing to make a PR with documentation, if that's OK with you. Congrats, by the way, on the work done.

davebshow commented 6 years ago

You should be able to use a very similar configuration file with aiogremlin. Configuration documentation can be found here: http://aiogremlin.readthedocs.io/en/latest/usage.html#configuring-the-cluster-object
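As a rough illustration of what "a very similar configuration" might look like, the sketch below maps the settings from the remote-secure.yaml above onto aiogremlin-style keys. The key names (scheme, hosts, port, username, password) are assumed from the aiogremlin configuration documentation linked above, the helper to_aiogremlin_config is hypothetical, and the password is a placeholder. This is untested against CosmosDB, and as reported later in this thread, the wss scheme failed with aiogremlin at the time unless ssl_certfile and ssl_keyfile were also set.

```python
def to_aiogremlin_config(tinkerpop_settings):
    """Hypothetical helper: map Gremlin console settings to aiogremlin-style keys.

    The target key names are an assumption based on the aiogremlin
    configuration docs, not something verified against CosmosDB.
    """
    pool = tinkerpop_settings.get("connectionPool", {})
    return {
        # enableSsl: true in the console config corresponds to a wss endpoint
        "scheme": "wss" if pool.get("enableSsl") else "ws",
        "hosts": list(tinkerpop_settings["hosts"]),
        "port": tinkerpop_settings["port"],
        "username": tinkerpop_settings["username"],
        "password": tinkerpop_settings["password"],
    }


# The same values as in remote-secure.yaml, written as a Python dict.
remote_secure = {
    "hosts": ["foo-test.graphs.azure.com"],
    "port": 443,
    "username": "/dbs/test-database/colls/test-graph",
    "password": "<key from the Azure portal>",  # placeholder, not a real key
    "connectionPool": {"enableSsl": True},
}

config = to_aiogremlin_config(remote_secure)
```

The resulting dict could then be written out as a YAML file for the configfile option, or passed as keyword options to the cluster; the serializer setting from the console config is not covered here.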

Regarding the tutorial. I don't think I would want to include it as part of the documentation, because then I would feel like I would need to add tutorials for other providers as well. That said, if you would like to write a blog post style tutorial about how to use aiogremlin with CosmosDB that would be fantastic. I would be happy to answer any questions.

Finally, I am not sure if they have made it work yet, but I know that a couple of guys at Microsoft were playing around with aiogremlin and CosmosDB. @vivekr20, any progress there?

davebshow commented 6 years ago

Any luck with this?

vivekr20 commented 6 years ago

Hey David, apologies for the late reply on this, I was OOF on vacation for a good part of last month.

We haven't gotten to testing our Gremlin endpoint with aiogremlin yet. That said, we recently addressed a bug in our websocket server that was causing issues with certain clients (including aiogremlin) when sending back responses.

@adrpino were you able to make any progress?

soderluk commented 6 years ago

@adrpino: Did you make any progress on this? I have the same issue: with the await Cluster.open(loop, configfile='config.yml') approach, it hangs. When I configure scheme: wss, it fails with:

Traceback (most recent call last):
  File "aiogremlin_test.py", line 98, in <module>
    client = loop.run_until_complete(get_client(loop))
  File "/usr/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
    return future.result()
  File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "aiogremlin_test.py", line 70, in get_client
    cluster = await Cluster.open(loop, configfile='config.yml')
  File "/home/vagrant/.virtualenvs/setup_scripts/lib/python3.5/site-packages/aiogremlin/driver/cluster.py", line 86, in open
    await cluster.establish_hosts()
  File "/home/vagrant/.virtualenvs/setup_scripts/lib/python3.5/site-packages/aiogremlin/driver/cluster.py", line 134, in establish_hosts
    url, self._loop, **dict(self._config))
  File "/home/vagrant/.virtualenvs/setup_scripts/lib/python3.5/site-packages/aiogremlin/driver/server.py", line 97, in open
    host = cls(url, loop, **config)
  File "/home/vagrant/.virtualenvs/setup_scripts/lib/python3.5/site-packages/aiogremlin/driver/server.py", line 34, in __init__
    certfile, keyfile=keyfile, password=ssl_password)
FileNotFoundError: [Errno 2] No such file or directory

Why does aiogremlin insist on having ssl_certfile and ssl_keyfile configured? I am working against my Azure Cosmos graph DB.

soderluk commented 6 years ago

I found that it works if I comment out the if statement in GremlinServer.__init__ that checks whether the scheme is https or wss (aiogremlin/driver/server.py, line 28):

if scheme in ['https', 'wss']:
...

When working against Cosmos DB there is no need for the certfile or keyfile.
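A less invasive alternative to commenting out the check might be to keep TLS enabled but only load a client certificate when one is actually configured. The sketch below uses only the standard library ssl module; build_ssl_context is a hypothetical helper, not part of aiogremlin, and it assumes that an empty certfile means "no client certificate". Calling load_cert_chain with an empty path is consistent with the FileNotFoundError in the traceback above.

```python
import ssl


def build_ssl_context(scheme, certfile="", keyfile="", password=""):
    """Hypothetical helper: return an SSLContext for https/wss, else None."""
    if scheme not in ("https", "wss"):
        return None
    # Default client-side context: verifies the server certificate
    # against the system CA store and checks the hostname.
    context = ssl.create_default_context()
    if certfile:
        # Only load a client certificate when a path was configured;
        # an empty path would raise FileNotFoundError.
        context.load_cert_chain(
            certfile,
            keyfile=keyfile or None,
            password=password or None,
        )
    return context


plain = build_ssl_context("ws")      # no TLS for plain websockets
secure = build_ssl_context("wss")    # TLS without a client certificate
```

Unlike the Gremlin console setup above, which trusts all certificates, this context still verifies the server, so it should only work if the endpoint presents a certificate signed by a CA the system trusts.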

allarp commented 4 years ago

@soderluk Did you find a more sustainable solution to this?

soderluk commented 4 years ago

@petterj, well, not exactly. I switched to ArangoDB and the Python drivers for that.

allarp commented 4 years ago

@soderluk Thanks for the prompt reply.