Open ericmj opened 9 years ago
I wonder how this should look like. Given the variety of different options we have from the driver, wouldn’t it be better to expose a way to call the driver directly but with the repo's pool?
That might be better, but I don't know what the would look like. We might be dropping the drivers own pooling in the future.
We already inject a __mongo_pool__/0
function to the repo at compile-time, and use it internally for accessing the pool.
Dropping the pooling in the driver would probably mean many changes anyway, and I'm not sure we could retain backwards compatibility when doing that. I think it could be fine to make the __mongo_pool__
function public, probably with a better name, and some documentation.
What do you think?
That sounds like the best way of doing it. We can revisit again when (if) we changing the pooling.
On Thu, Oct 22, 2015 at 9:44 AM, Michał Muskała notifications@github.com wrote:
We already inject a mongo_pool/0 function to the repo at compile-time, and use it internally for accessing the pool. Dropping the pooling in the driver would probably mean many changes anyway, and I'm not sure we could retain backwards compatibility when doing that. I think it could be fine to make the mongo_pool function public, probably with a better name, and some documentation.
What do you think?
— Reply to this email directly or view it on GitHub https://github.com/michalmuskala/mongodb_ecto/issues/45#issuecomment-150247317 .
Eric Meadows-Jönsson
What's the status on this....is this possible yet? Unless I've missed how various MongoDB aggregations can be done, there appears to be quite a bit of variance between what can be accomplished in queries directly against MongoDB vs. through mongo_ecto. It seems that one way to leap past the immediate need to integrate individual aggregation operations into mongo_ecto would be just to be able to send a raw statement to MongoDB. Can this be done?
This is how I do it and I think it's right -- it uses the same pool as used by ecto if I'm understanding the code correctly:
Mongo.update_one(MyApp.Repo.Pool, collection, query, update, opts)
@scottmessinger Hey thanks for the reply. I have read through the mongo_ecto doc, and I confess, I haven't been able to figure out how to do this. Since you have a handle on it can you give me an idea of how to accomplish this MongoDB query through the mongo_ecto driver?
db.snapshots.aggregate( [ { $sort: { env_id: 1, update_timestamp: 1 } }, { $group: { _id: "$env_id", lastSalesDate: { $last: "$update_timestamp" } } } ] )
I think maybe if I could see this example I could extrapolate from that and figure out related queries.
@ericmj i am also looking for run mongo query directly using mongo_ecto driver, i tried a lot to execute join queries using mongo_ecto and not able to figure it out , basically we need it for run join queries using $lookup key, please let me know if you guys have any solution for it.
Thanks in advance.
You need to pass in the mongo pool from your ecto repository. Like this: Mongo.find(MyRepo.mongo_pool, ...)
.
@ericmj is mongo_pool
just a placeholder for the actual pool?
I found this using google and for me (using Ecto 2.0 and mongodb_ecto from ankhers/mongodb_ecto, not sure if this is up-to-date),
Mongo.count(MyApp.Repo.Pool, "mycollection", nil)
works.
@iStefo I don't know, it's possible it works differently in the Ecto 2.0 branch of mongodb_ecto.
@iStefo You will most likely want to use MyApp.Repo.__pool__
. It will return a tuple with the pool in the first element, and the configuration in the second. You can see this being injected here.
The reason you may want to do this instead of just using MyApp.Repo.Pool
is that you can actually configure the pool that this adapter uses. I'm not sure how many people actually do this though.
@ankhers thanks for the explanation! Would it make sense to include this bit in the documentation? I think it would be particularly useful for executing aggregation functions which are (understandably) not implemented in this Ecto adapter.
I don't think so. That function is meant to be for internal use only (hence the leading and trailing double underscore).
I do, however, think it would be beneficial to have something more similar to Ecto.Adapters.SQL.query
. Maybe this adapter can provide a module Ecto.Adapters.Mongo
that has various functions for working with the base Mongo adapter directly for insert, update, aggregate, etc.
Unfortunately because of the differences between SQL and Mongo, I do not think it would be possible to provide a single query
function like the SQL adapter uses.
I don't think so. That function is meant to be for internal use only (hence the leading and trailing double underscore).
It was only proposed as a workaround. We definitely want a function for this, that's why the issue is still open and up for grabs :)
I'm not sure whether the original intent of this issue reflects the current state in the discussion:
As @ankhers noticed, mongoDB can't be used with only one query
function to execute the different types of queries, with aggregation probably being the most common. In fact, all the database operations are modelled as individual methods on the collection object: https://docs.mongodb.com/manual/reference/method/
https://github.com/ericmj/mongodb already does a good job bringing many of these methods to elixir.
Instead of wrapping all these functions, I think it would be preferable to expose the connection pool in a documented manner (and call it a day from the view of this adapter).
That is also a good way to move forward. Would you be able to send in a PR?
Yep, I will try to do it tomorrow.
2 questions:
What name would you think is suitable for exposing the pool? Simple rename of __pool__/0
to pool/0
?
Is there any other use case that requires accessing the pool? Otherwise it would be possible to proxy calls to the mongo driver by using Kernel.apply/3
like this:
def mongo_call(method, args \\ []) when is_atom(method) and is_list(args) do
{pool, _} = __pool__()
apply(Mongo, method, [pool | args])
end
I think this approach is more elegant since it requires less boilerplate for the application developers.
In the above snippet, the user would call mongo_call(:aggregate, ["channels", [%{ "$group" => ... }]])
.
But we could also make this new mongo_call (not sure with the name, btw) method specific for collections and automatically convert from Ecto Schemas to collection names (binaries). This would be more similar to the way the other Repo methods are used.
Direct access to the pool would still be (semi)private to the Repo module, however. I honestly don't know whether this is desirable or not.
With the mongo driver migrating to db_connection it's not as easy as just passing the right pool name. Some pool options need to also be passed so that db_connection can work properly. We probably need a wrapper around the entire interface of Mongo
that will do "the right thing (TM)".
We could call it Mongo.Ecto.Native
or Mongo.Ecto.Raw
.
I see the changes made to the MongoDB module in https://github.com/ericmj/mongodb/compare/emj-dbconn#diff-6592a4605ea4a81c3fe573f99e1a1cb1.
So instead of passing in the pool name, a DbConnection.Conn
is expected.
Has there already been some work done in this project to reflect the upcoming changes?
What do you specifically mean with "doing the right thing"? Taking care of the underlying Mongo
's calling requirement? (Pool vs Conn?)
The changes are incorporated as part of the PR https://github.com/michalmuskala/mongodb_ecto/pull/84.
Passing the pool name as the first argument is completely fine, but we should also be passing the pool implementation (db_connection supports multiple pools) in options and merge default pool, connection, and query timeouts.
The documentation in https://github.com/ericmj/mongodb/blob/0687ef615e3a9a57156118e47dcd615e92b0e8a2/lib/mongo.ex does not yet reflect the changes that are already in the code so I'm not sure how passing generic(?) options to the Mongo
methods works.
Do I understand you correctly in that it would be possible for the user to do
mongo_call(:aggregate, ["channels", [%{ "$group" => ... }], timeout: 2000])
and we would need to merge the :timeout
option with the default options stored in the Repo and pass them down to the original Mongo
method as the last argument?
Mongo.aggregate("channel", [%{ "$group" => ... }], [timeout: 2000, url: ..., adapter: ...])
As long as the convention of options-last holds, this seems like a feasible approach.
To make it easier for us (remove the need of detecting and extracting a Keyword list from the list of arguments), we could change the method signature to mongo_call(method, args, options \\ [])
and then merge the options that are passed in with the module's default options.
The documentation in https://github.com/ericmj/mongodb/blob/0687ef615e3a9a57156118e47dcd615e92b0e8a2/lib/mongo.ex
You are looking at a commit from 2016, check out master instead.
@ericmj thanks, is nicely documented in https://github.com/ericmj/mongodb/blob/master/lib/mongo.ex#L6
Would love to help, let me know if there is anything I can do.
Basically the equivalent of
Ecto.Adapters.SQL.query
. We need to discuss what the API should be like because the driver exposes multiple functions where the SQL drivers only have one.I would be willing to work on this.