Question: Use in NoSQL ( MongoDB ) queries

lantier commented 4 years ago

Hey. At our company we have maintained a fork of this project for a year, extending queries so they can run against MongoDB databases, both as simple queries and as aggregation pipelines.

Do the project owners and current contributors have any interest in officially extending the app in this direction? We can contribute our modifications as a starting point.

albertodonato commented 4 years ago

Hi, it's great to know that query-exporter is being used!

It would be nice to know how the project was forked to support MongoDB. The reason I used SQLAlchemy as the DB engine is that it supports quite a lot of database servers.

I guess to support mongo you actually had to extend the DataBase class to use a different engine?

lantier commented 4 years ago

Actually, we forked from version 1.8.1 (yes... we're quite outdated), and after a couple of fixes that you provided in the meantime we thought: "Well, maybe if the feature we want were official, we wouldn't need to backport anything."

So, to answer your question: yes, we extended the Database class as follows, supporting a count YAML key for simple queries and an aggregate key for aggregation pipelines:


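import asyncio
import logging
from ast import literal_eval
from typing import List, Optional, Union

from motor.motor_asyncio import AsyncIOMotorClient

# _DataBase, Query, QueryResults, MetricResult and FATAL_ERRORS come from
# the fork's own db module and are not shown here.


class MongoDataBase(_DataBase):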
    """A database to perform Queries."""

    _conn: Union[AsyncIOMotorClient, None] = None
    _logger: logging.Logger = logging.getLogger()
    _pending_queries: int = 0

    async def connect(self, loop: Optional[asyncio.AbstractEventLoop] = None):
        """Connect to the database."""
        if loop:
            self._conn = AsyncIOMotorClient(self.dsn, io_loop=loop)
        else:
            self._conn = AsyncIOMotorClient(self.dsn)

        try:
            await self._conn.server_info()
        except Exception as error:
            self._conn = None
            raise self._db_error(error)
        self._logger.debug(f'connected to database "{self.name}"')

    async def execute(self, query: Query) -> List[MetricResult]:
        """Execute a query."""
        if not self.connected:
            await self.connect()

        self._logger.debug(
            f'running query "{query.name}" on database "{self.name}"')
        self._pending_queries += 1
        self._conn: AsyncIOMotorClient
        try:
            db = self._conn[query.sql['database']]
            coll = db[query.sql['collection']]

            if 'count' in query.sql:
                labels_keys = list(query.sql.get('labels', {}).keys())
                labels_values = [
                    query.sql['labels'][key] for key in labels_keys
                ]
                metrics_names = [metric.name for metric in query.metrics]

                count = await coll.count_documents(
                    literal_eval(query.sql['count']))
                counts = [count] * len(query.metrics)

                return query.results(
                    QueryResults(
                        metrics_names + labels_keys,
                        [tuple(counts + labels_values)]))
            else:
                cursor = coll.aggregate(literal_eval(query.sql['aggregate']))
                return query.results(await QueryResults.from_cursor(cursor))

        except Exception as error:
            raise self._query_db_error(
                query.name, error, fatal=isinstance(error, FATAL_ERRORS))
        finally:
            assert self._pending_queries >= 0, 'pending queries is negative'
            self._pending_queries -= 1
            if not self.keep_connected and not self._pending_queries:
                await self.close()

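The count branch above builds a single result row: one column per metric, followed by one column per label. A minimal standalone sketch of that assembly is below. The `labels` value, the metric name and the count of 7 are stand-ins chosen for illustration; in the fork they would come from `query.sql`, `query.metrics` and `coll.count_documents(...)`.

```python
# Stand-in values; in the fork these come from query.sql and query.metrics.
sql = {
    "database": "XPTO1",
    "collection": "documents",
    "count": '{"status": "reviewed"}',
    "labels": {"env": "prod"},  # hypothetical labels mapping
}
metric_names = ["process_failed_count"]
count = 7  # pretend result of coll.count_documents(...)

label_keys = list(sql.get("labels", {}).keys())
label_values = [sql["labels"][key] for key in label_keys]

# One column per metric, then one per label; a single row carries the values.
columns = metric_names + label_keys
row = tuple([count] * len(metric_names) + label_values)

print(columns)  # ['process_failed_count', 'env']
print(row)      # (7, 'prod')
```

Every metric listed for the query receives the same count, which is why the count is repeated `len(metric_names)` times.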

albertodonato commented 4 years ago

Could you please also provide an example config.yaml for mongodb?

lantier commented 4 years ago

The metrics section is unchanged; the only change is in the queries key. For aggregation pipelines:

queries:
    sources_queries_under_review_by_company_source:
        interval: 60
        databases: [mymongo]
        metrics: [sources_queries_under_review_by_company_source]
        sql:
            database: "sources"
            collection: "queries"
            aggregate: |
                [
                    {
                        '$match': {
                            'currentStatus.status': 'UNDER_REVIEW'
                        }
                    }, {
                        '$group': {
                            '_id': {
                                'company': '$company.slug',
                                'source': '$source.name'
                            }, 
                            'sources_queries_under_review_by_company_source': {
                                '$sum': 1
                            }
                        }
                    }, {
                        '$project': {
                            '_id': False, 
                            'company': '$_id.company', 
                            'source': '$_id.source', 
                            'sources_queries_under_review_by_company_source': 1
                        }
                    }
                ]
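To make the shape of the pipeline's output concrete, here is a rough plain-Python emulation of the three stages on a few made-up documents. It does not talk to MongoDB; the sample documents and their values are assumptions for illustration only.

```python
from collections import Counter

# Made-up documents standing in for the "queries" collection.
docs = [
    {"company": {"slug": "acme"}, "source": {"name": "web"},
     "currentStatus": {"status": "UNDER_REVIEW"}},
    {"company": {"slug": "acme"}, "source": {"name": "web"},
     "currentStatus": {"status": "UNDER_REVIEW"}},
    {"company": {"slug": "acme"}, "source": {"name": "api"},
     "currentStatus": {"status": "DONE"}},
    {"company": {"slug": "globex"}, "source": {"name": "api"},
     "currentStatus": {"status": "UNDER_REVIEW"}},
]

METRIC = "sources_queries_under_review_by_company_source"

# $match: keep only documents that are under review.
matched = [d for d in docs if d["currentStatus"]["status"] == "UNDER_REVIEW"]

# $group with {'$sum': 1}: count documents per (company, source) pair.
groups = Counter((d["company"]["slug"], d["source"]["name"]) for d in matched)

# $project: flatten the group key into label fields next to the metric value.
# Sorted here only to make the printed order deterministic.
rows = [
    {"company": company, "source": source, METRIC: n}
    for (company, source), n in sorted(groups.items())
]

for row in rows:
    print(row)
```

Each resulting row carries the metric value under the metric's own name, with the remaining fields (`company`, `source`) available as labels.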

For simple queries (count):

queries:
  process_failed_count:
    interval: 60
    databases: [my-base]
    metrics: [process_failed_count]
    sql:
      database: XPTO1
      collection: documents
      count: |
        {
          "integrations.xtr.sent": False,
          "status": {"$in": ["reviewed", "audited"]}
        }

As you can see, we write the Mongo queries as Python literals (they are parsed with literal_eval), since the default JS syntax is not supported.
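A small sketch of why the Python spelling matters: `ast.literal_eval` accepts Python literals such as `False`, while the same text is not valid JSON, and the JSON-style `false` is rejected by `literal_eval`. The query text is the count filter from the example above; everything else is illustrative.

```python
import json
from ast import literal_eval

# The count filter from the config, as the YAML block scalar would deliver it.
query_text = """
{
  "integrations.xtr.sent": False,
  "status": {"$in": ["reviewed", "audited"]}
}
"""

# Python literal syntax parses into a plain dict usable as a Mongo filter.
filter_doc = literal_eval(query_text.strip())
print(filter_doc["integrations.xtr.sent"])  # False

# The same text is not valid JSON because of the capitalized False.
try:
    json.loads(query_text)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

# Conversely, JSON-style lowercase false is not a Python literal.
try:
    literal_eval('{"integrations.xtr.sent": false}')
    js_style_accepted = True
except ValueError:
    js_style_accepted = False

print(parsed_as_json, js_style_accepted)  # False False
```

`literal_eval` only evaluates literals (dicts, lists, strings, numbers, booleans, `None`), so it does not execute arbitrary expressions from the config file.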

lihaiswu commented 3 years ago

@albertodonato @lantier is it possible to merge the code to support Mongo? I'm currently using query-exporter to query Postgres, and it's a really nice tool. But we'll move to Mongo soon, so hopefully query-exporter can support Mongo queries as well.

xuanyuanaosheng commented 3 years ago

@albertodonato This is a good idea, but I think it should be done in a separate repo.

Relational and non-relational databases are still quite different.

xuanyuanaosheng commented 2 years ago

@albertodonato Any update?

albertodonato / query-exporter

Question: Use in NoSQL ( MongoDB ) queries #66