Velocidex / velociraptor

Digging Deeper....
https://docs.velociraptor.app/

start and count in clients() not working correctly #3685

Closed mehmetbarispolat closed 2 months ago

mehmetbarispolat commented 2 months ago

The start and count parameters of clients() do not seem to have any effect.

I ran the following query:

SELECT os_info.system as OS
FROM clients(start=0,count=1)
ORDER BY _LastSeenAt DESC

But it returns all of the data:

[
  {
    "OS": "linux"
  },
  {
    "OS": "windows"
  },
  {
    "OS": "linux"
  },
  {
    "OS": "darwin"
  },
  {
    "OS": "windows"
  },
  {
    "OS": "darwin"
  },
  {
    "OS": "linux"
  },
  {
    "OS": "linux"
  },
  {
    "OS": "windows"
  },
  {
    "OS": "linux"
  },
  {}
]

scudette commented 2 months ago

Yes, you are right: those fields are no longer used at all. The limit can be set using the VQL LIMIT clause, and the count can be set using WHERE.

mehmetbarispolat commented 2 months ago

Is there an alternative clause for the start parameter, like OFFSET in SQL?

scudette commented 2 months ago

What are you trying to do? Offset and length are used for paging, which is usually not necessary in VQL queries.

predictiple commented 2 months ago

I suppose you can do

SELECT count() AS _Count, client_id
FROM clients()
WHERE _Count > 50
LIMIT 30

scudette commented 2 months ago

This might not be stable in general, so you probably need to order it too:

SELECT client_id, count() AS RowID
FROM clients()
WHERE RowID > 50
ORDER BY client_id
LIMIT 30

But the main difficulty with this approach is that VQL does not have an index snapshot, so there is no guarantee that one page follows the previous one (a new client can just appear in between the queries).

If it is just used to fill in a GUI pager, it is probably not super important if we lose or repeat a couple of clients between pages, but for more critical uses it is best to just get all the clients in their entirety.
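To make the consistency problem concrete, here is a minimal Python sketch of offset-based paging over a changing client list. It is a simplified model of the general technique, not of how VQL evaluates the query above; the `page` helper and the client IDs are made up for illustration.

```python
def page(client_ids, offset, limit):
    """Sort by client ID, skip `offset` rows, then keep `limit` rows."""
    ordered = sorted(client_ids)
    return ordered[offset:offset + limit]


# Hypothetical client IDs, used only for this illustration.
clients = ["C.0002", "C.0004", "C.0006", "C.0008"]
first = page(clients, offset=0, limit=2)

# A new client enrols between the two queries and sorts ahead of page two.
clients.append("C.0001")
second = page(clients, offset=2, limit=2)

# Any ID present in both pages was returned twice.
repeated = sorted(set(first) & set(second))
print(first, second, repeated)
```

Because every row shifts by one position after the insertion, the last row of the first page shows up again at the start of the second page. A deletion between queries would cause a row to be skipped instead.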

mehmetbarispolat commented 2 months ago

> What are you trying to do? offset and length are used for paging which is usually not necessary in VQL queries.

Actually, I need pagination.

scudette commented 2 months ago

Why? If you are accessing over AJAX, you are probably better off using the GUI APIs (but these are not supported or stable).

scudette commented 2 months ago

You can always make your own snapshot using write_jsonl() and just page that. This will guarantee consistency if you really need that.
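As a rough sketch of the snapshot idea, the Python below pages through a JSONL file by line offset. It assumes the snapshot has already been written with one JSON record per line; the `read_page` helper, the temporary file, and the sample records are hypothetical stand-ins for a real snapshot.

```python
import itertools
import json
import os
import tempfile


def read_page(path, start, count):
    """Return the parsed rows at line offsets [start, start + count)."""
    with open(path, encoding="utf-8") as fd:
        lines = itertools.islice(fd, start, start + count)
        return [json.loads(line) for line in lines]


# Build a throwaway snapshot so the example runs on its own.
rows = [{"client_id": f"C.{i:04d}"} for i in range(5)]
with tempfile.NamedTemporaryFile(
    "w", suffix=".json", delete=False, encoding="utf-8"
) as fd:
    for row in rows:
        fd.write(json.dumps(row) + "\n")
    snapshot_path = fd.name

page_two = read_page(snapshot_path, start=2, count=2)
os.unlink(snapshot_path)
print(page_two)
```

Since the file does not change between requests, consecutive pages cannot overlap or skip rows, at the cost of serving data that is only as fresh as the snapshot.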

mehmetbarispolat commented 2 months ago

The client count might be 10k in my system, so I need pagination when listing the clients. I don't want to return all of the clients to the frontend. If I can do the paging in VQL, I would like to use VQL.

Also, I am using pyvelociraptor.

scudette commented 2 months ago

Pyvelociraptor uses gRPC, which is a streaming API. So there is no need to page it, because the API will take care of paging by itself. It's fine to fetch all the client records in one query.

If you look at the Python code, you will see that it already receives pages, with a maximum size specified by the API call.
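A minimal sketch of consuming such a stream in Python might look like the following. It assumes each streamed message carries a JSON-encoded list of rows in a `Response` attribute, modelled loosely on the pyvelociraptor example client; `FakeResponse` and `iter_rows` are hypothetical names, and the attribute should be checked against the installed version.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeResponse:
    """Stand-in for one streamed message (hypothetical, for illustration)."""
    Response: str  # JSON-encoded list of rows; empty if the message has none


def iter_rows(responses: Iterable) -> Iterator[dict]:
    """Flatten a stream of row batches into individual rows."""
    for response in responses:
        if not response.Response:
            continue  # skip messages that carry no rows
        yield from json.loads(response.Response)


# Simulated stream: two batches of rows with an empty message in between.
stream = [
    FakeResponse('[{"client_id": "C.1"}, {"client_id": "C.2"}]'),
    FakeResponse(""),
    FakeResponse('[{"client_id": "C.3"}]'),
]
client_ids = [row["client_id"] for row in iter_rows(stream)]
print(client_ids)
```

With a generator like this, the caller handles one batch at a time and never needs to hold the full result set in memory, which is why explicit offset paging is usually unnecessary on the client side.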