Closed justmark closed 3 years ago
Hi @justmark,
Any idea what I've done wrong?
I don't think you tried anything wrong actually 😉. moongoon
is quite opinionated regarding ObjectId
properties and it assumes that the model field _id
will always be of that type.
The problem with using $group
here is that it will return a result that will not comply with the defined Model shape, especially since _id
will be an object instead of an ObjectId.
It would be quite complicated to change that since working with union of possible id values would be cumbersome.
The solution would be to use the underlying driver (cryomongo) syntax to query the database and return values.
I've prepared a small self-contained example that should illustrate what I mean below
require "moongoon"
Moongoon.connect("mongodb://localhost:27017", database_name: "test_db")
# Define a regular model - but without the aggregation_pipeline macro.
class MyModel < Moongoon::Collection
collection "models"
property name : String
end
# Clear the db.
MyModel.clear
# Insert models.
[
{name: "one"},
{name: "two"},
{name: "two"},
{name: "three"},
{name: "three"},
{name: "three"},
].map { |model| MyModel.new(**model).insert }
# Use the `.collection` method to access the cryomongo collection object.
# (See: https://elbywan.github.io/cryomongo/Mongo/Collection.html)
# Then call `.aggregate` to perform the aggreation query.
cursor = MyModel.collection.aggregate(
pipeline: [
{
"$sort": {"name": 1},
},
{
"$group": {
"_id": {
"name": "$name",
},
"data": {
"$addToSet": "$$ROOT",
},
},
},
]
)
# Put the results inside an array.
# Here the results are of type BSON, but it is possible to type them (we will do that later on).
bson_results = cursor.try &.to_a
# Print the results.
puts bson_results.to_pretty_json
# => [
# {
# "_id": {
# "name": "one"
# },
# "data": [
# {
# "_id": {
# "$oid": "6018225687abcc60a186f4ff"
# },
# "name": "one"
# }
# ]
# },
# {
# "_id": {
# "name": "two"
# },
# "data": [
# {
# "_id": {
# "$oid": "6018225687abcc60a186f500"
# },
# "name": "two"
# },
# {
# "_id": {
# "$oid": "6018225687abcc60a186f501"
# },
# "name": "two"
# }
# ]
# },
# {
# "_id": {
# "name": "three"
# },
# "data": [
# {
# "_id": {
# "$oid": "6018225687abcc60a186f502"
# },
# "name": "three"
# },
# {
# "_id": {
# "$oid": "6018225687abcc60a186f504"
# },
# "name": "three"
# },
# {
# "_id": {
# "$oid": "6018225687abcc60a186f503"
# },
# "name": "three"
# }
# ]
# }
# ]
# Access an element field.
# Casting with .as(BSON) is needed to parse values, just like JSON::Any.
puts bson_results.try &.map &.["_id"].as(BSON)["name"]
# => ["one", "two", "three"]
# As casting can be cumbersome, it is also possible to type the results.
# In order to do so, we need to define a model matching the aggregation result shape.
class MyAggregatedModel
# We need to include this module to enable serialization from mongodb.
include BSON::Serializable
record Id, name : String do
include BSON::Serializable
end
record DataElement, _id : BSON::ObjectId, name : String do
include BSON::Serializable
end
property _id : Id
property data : Array(DataElement)
end
# Replaying the query since the previous cursor has been exhausted.
cursor = MyModel.collection.aggregate(
pipeline: [
{
"$sort": {"name": 1},
},
{
"$group": {
"_id": {
"name": "$name",
},
"data": {
"$addToSet": "$$ROOT",
},
},
},
]
)
# Use .of(_) to cast values automatically.
aggregated_models = cursor.try(&.of(MyAggregatedModel).to_a) || [] of MyAggregatedModel
pp aggregated_models
# => [#<MyAggregatedModel:0x111cea120
# @_id=MyAggregatedModel::Id(@name="one"),
# @data=
# [MyAggregatedModel::DataElement(
# @_id=
# BSON::ObjectId(
# @data=Bytes[96, 24, 37, 54, 221, 231, 147, 231, 191, 80, 216, 101]),
# @name="one")]>,
# #<MyAggregatedModel:0x111cea0c0
# @_id=MyAggregatedModel::Id(@name="two"),
# @data=
# [MyAggregatedModel::DataElement(
# @_id=
# BSON::ObjectId(
# @data=Bytes[96, 24, 37, 54, 221, 231, 147, 231, 191, 80, 216, 102]),
# @name="two"),
# MyAggregatedModel::DataElement(
# @_id=
# BSON::ObjectId(
# @data=Bytes[96, 24, 37, 54, 221, 231, 147, 231, 191, 80, 216, 103]),
# @name="two")]>,
# #<MyAggregatedModel:0x111cea080
# @_id=MyAggregatedModel::Id(@name="three"),
# @data=
# [MyAggregatedModel::DataElement(
# @_id=
# BSON::ObjectId(
# @data=Bytes[96, 24, 37, 54, 221, 231, 147, 231, 191, 80, 216, 105]),
# @name="three"),
# MyAggregatedModel::DataElement(
# @_id=
# BSON::ObjectId(
# @data=Bytes[96, 24, 37, 54, 221, 231, 147, 231, 191, 80, 216, 104]),
# @name="three"),
# MyAggregatedModel::DataElement(
# @_id=
# BSON::ObjectId(
# @data=Bytes[96, 24, 37, 54, 221, 231, 147, 231, 191, 80, 216, 106]),
# @name="three")]>]
# And to access an element field this time it is much easier.
puts aggregated_models.map &._id.name
# => ["one", "two", "three"]
Thanks for the prompt reply. I was already looking at using the Cryomongo shard after running into this issue.
I'll give your suggestion a spin.
Thanks, appreciate your help.
This is working perfectly.
Hi,
Further to the above, have run into an issue when trying to deserialize and a field isn't present. The model is:
class FivesAggregatedModel
include BSON::Serializable
record Id, md : Int32 do
include BSON::Serializable
end
record DataElements, _id : BSON::ObjectId,
charge_phase : Int32,
include BSON::Serializable
end
property _id : Id
property data : Array(DataElements)
end
When running this I end up with the following error:
Unhandled exception: Unable to deserialize key (charge_phase) having value () of type (Nil) belonging to type 'FivesAggregatedModel::DataElements'. Expected type(s) '[Int32]'. (Exception)
from src/models/fives_aggregated.cr:19:7 in 'initialize'
from src/models/fives_aggregated.cr:19:7 in 'new'
from src/models/fives_aggregated.cr:19:7 in 'from_bson'
from lib/bson/src/bson/ext/array.cr:4:5 in 'from_bson'
from src/models/fives_aggregated.cr:2:5 in 'initialize'
from src/models/fives_aggregated.cr:2:5 in 'new'
from src/models/fives_aggregated.cr:2:5 in 'from_bson'
from lib/cryomongo/src/cryomongo/cursor.cr:174:7 in 'next'
from /usr/share/crystal/src/iterator.cr:488:7 in 'to_a'
from src/analyze.cr:12:38 in 'start'
from src/app.cr:62:1 in '__crystal_main'
from /usr/share/crystal/src/crystal/main.cr:110:5 in 'main_user_code'
from /usr/share/crystal/src/crystal/main.cr:96:7 in 'main'
from /usr/share/crystal/src/crystal/main.cr:119:3 in 'main'
from __libc_start_main
from _start
from ???
This is only happening when the charge_phase field isn't present in a document (which is a valid situation, data wise). I was going through your BSON project as well to see if there was a way to deal with this condition. Any ideas?
Thanks, Mark
Hi @justmark,
Could you try marking the field as nilable?
charge_phase : Int32?
@elbywan That did the trick. Sorry I missed that :)
Hi,
Firstly, thanks for all the work on this project. We're looking at transitioning to Crystal for some work, and this shard is making that possible.
I seem to have run into an aggregation issue though.
I have the following pipeline defined:
This results in _id being an Int rather than an ObjectId. The query works fine in the MongoDB shell. Unfortunately it generates an exception:
Unhandled exception: Unable to deserialize key (_id) having value (BSON(@data=Bytes[13, 0, 0, 0, 16, 109, 100, 0, 212, 7, 241, 95, 0])) and of type (BSON) belonging to type 'Fives'. Expected type(s) '[BSON::ObjectId]'. (Exception)
Any idea what I've done wrong?
Thanks, Mark