Closed github-germ closed 3 years ago
Btw, the same search
and select
count comparison for events matches fine.
Are you 100% certain that the user the query in PyMISP can see all these attributes? If not, the difference makes sense.
Same if you have orphaned attributes in your MISP instance.
Please find out a bunch of attributes that aren't returned by the search query, and check if the same user you're using to do the query can see them on the web interface.
admin
role "see" everything?The user is admin
.
DING DING DING... I think the diff are objects. So I will have the PyMISP script also grab search(controller='objects',...)
. I'll report back.
I don't think controller='objects'
exists. In order to count the attributes, you can also only select attributes with object_relation
empty or null, that will skip the object attributes.
Yes, I see now...
elif controller == 'objects':
raise PyMISPNotImplementedYet('Not implemented yet')
OK, I'll play more.
OK... So if my use case is to purge
all ingested indicator things that have not be revised since some date (e.g. 365d
), then are there just 3 things: events
, attributes
, objects
? I forgot about objects
(we do have them). Any others?
Does this basic plan make sense?
delete_event
with old timestamps
delete_attribute
with old timestamps
delete_object
with old timestamps
The outcome is all events
that remain have only attributes
and objects
that are newer than the purge date. This will be run nightly (the first batch being the largest to process).
I will also keep a JSON file for each archiving what gets deleted each night and age-out the archive, e.g. every 30 days (as a safety).
Yes, that should work, except that we do not have a way to search for objects directly. What you will need to do is either to delete a whole event, or delete the attributes and afterwards search for empty objects, and then delete them.
Let me see if I am following you...
search(controller='attributes',...)
includes the attributes that are members of objects?afterwards search for empty objects, and then delete them
In your case, it is not blocking for you, as it just mean you have empty objects in your database, but we should have a way to trigger that cleanup at the instance level.
Does it make sense to close this issue and spawn a new one requesting a way to access objects, at the very least delete empty objects? Thanks.
yes, please open an issue on MISP to have a call that removes empty objects. I'm not sure how it is doable, but that's the way to go.
Closing this issue since we now have #491.
Hmmm, I think you wanted this new one on MISP/MISP, not MISP/PyMISP. Can you move it over?
There were still non object attributes that pymisp.search
were not returning within a timestamp
(from, to) range. Turns out they all had distribution=5
, i.e. inherit event
.
So is it true that search does not return attributes that are inherited? And if so, then if we are looking to purge attributes that are old (e.g. a year), then is there any way to delete those via PyMISP?
They should be returned (inherit is the default). What was the distribution of the event?
They should be returned (inherit is the default). What was the distribution of the event?
distribution=5
and those 27k attributes are not returned via search
.Thx.
I'm talking of the distribution of the event the attributes are part of. The attributes inherit the distribution of the events they're part of, and that may (or may not) be the problem. It's just a data point I'd like to have.
the event the attributes are part of
OK, the 27K attributes older than 365 days, which search
is not returning, are children of 94 events. Those events all have distribution, sharing_group_id, locked
set to 0
, the same local org
. However, interesting values found in orgc
:
13 of the events have event.orgc_id == 32
MariaDB [misp]> select * from roles where id=32 \G
Empty set (0.01 sec)
MariaDB [misp]>
81 of the events event.orgc.id == 3
MariaDB [misp]> select name from roles where id=3;
+------+
| name |
+------+
| User |
+------+
1 row in set (0.00 sec)
MariaDB [misp]>
User
role is not visible in the WebUI List Organisations page.ok, there seems to be a lot of confusion in here.
Role != orgc.
select * from roles where id = 32
This will return the description of role ID 32 (/roles/index via your UI to see what roles are).
Roles control the ACL when it comes to functionalities a user has access to. For example, can a user with a given role attach tags to data? Can they get access to the audit logs? Can they administer users of their own organisation?
orgc is the creator organisation of an event. It is an alias for the organisations (you can view these via /organisations/index in your UI).
if you wanted to see the organisation that corresponds to an event's creator org ID (event.orgc.id), you have to run this query:
select name from organisations where id=32;
Of course I know the diff. Multitasking error: pretty embarrassing. SORRY that I made you lecture me on the diff when I knew that!! Let's try that again...
Let me know what else you'd like to see please.
These 2 orgs are both NOT local. Note that we have 25 non-local orgs:
MariaDB [misp]> select count(*) from organisations where local=0;
+----------+
| count(*) |
+----------+
| 25 |
+----------+
13 of the events have event.orgc_id == 32
MariaDB [misp]> select id,name,local from organisations where id=32;
+----+-------------------+-------+
| id | name | local |
+----+-------------------+-------+
| 32 | Synovus Financial | 0 |
+----+-------------------+-------+
81 of the events event.orgc.id == 3
MariaDB [misp]> select id,name,local from organisations where id=3;
+----+-------+-------+
| id | name | local |
+----+-------+-------+
| 3 | CIRCL | 0 |
+----+-------+-------+
Non local organisations can have events on your instance, that is absolutely no issue and the events shouldn't behave any differently than the other events. If they do, something really funky is going on.
Basically if I create an event on the CIRCL MISP instance and you pull it from there / via our OSINT feed / via a 3rd party, then MISP will receive an event that was originally created by an organisation that your instance doesn't know. Assuming that you are org.id = 1, if you pulled an event created by us in any of the above mentioned ways, MISP would do the following:
As for the difference between the SQL and the restSearch counts: Something that can cause this is orphaned attributes. Those will never be returned by restSearch. Could this be the issue? Did you manually delete stuff from the DB?
Cool. That makes sense.
Our Use Case is a Purge Process that deletes all events, attributes, and objects that have a timestamp older than a year (for example).
I've identified that after deleting all the candidate aged-out events and attributes, that there are still 27k attributes remaining that were not returned by pymisp.search
which appear to fall into two categories:
MISPObjects
See https://github.com/MISP/MISP/issues/5380 where I've requested a method to delete those via PyMISP.As for the difference between the SQL and the restSearch counts: Something that can cause this is orphaned attributes. Those will never be returned by restSearch. Could this be the issue? Did you manually delete stuff from the DB?
Empty MISP objects can't be deleted all that easily (if you're after a purge tool that is), so I think I misunderstood the other issue.
non local events can absolutely be deleted locally.
Very curious, will see if I can reproduce it tomorrow, it's time to hit the sack over here soon ;)
Thanks. Catch you manana... Will be interested to see how you can delete the non-local content when my search
cannot find them. Sleep well !!
OK, I've been playing around with this over the weekend:
If I set a timestamp on attributes/restSearch it will add two separate where clauses:
the timestamp range will be checked against the attribute timestamp as expected but also against the event timestamp
What this means with a simple example:
13 months ago I created an event, Event E1 with Attributes A1, A2, and an Event E2 with the attributes A3, A4. The timestamps of both events and all 4 attributes are set to 13 months ago.
searching for the timeframe of 12 months and older would return all 4 attributes
7 months ago, I decided to add attribute A5 to E2. Adding an attribute to an event updates its timestamp, meaning that now we have E2.timestamp and E2.A5.timestamp being 7 months ago, with All of E1 and A3+A4 still being on 13 months ago as expected.
searching however no longer returns A3 and A4 in our previous search query as the containing event is now newer than 12 months old.
So indeed this will cause the discrepancy that you're seeing - it's inline with the intent of the timestamp filter's functionality, but it indeed shows that we're lacking a filter here. So what I'll do next:
Add a new filter option (attribute_timestamp) that will purely search based on the attribute timestamp ignoring the event level. We already have this on the event level via event_timestamp anyway.
I've also added a new tool in MISP that will allow you to inspect the queries that MISP is executing so that you can compare it more easily to direct SQL queries - I highly encourage you to use this and report back anything weird you're seeing in there.
The way it works:
Simply pass the sql:1 parameter along with your API request to get the last 200 SQL queries included in the output. Passing sql:2 will omit the actual results and just return the SQL queries.
For example:
POST to /attributes/restSearch BODY:
{
"returnFormat": "json",
"timestamp": [1567296000, 1573222784],
"sql": 2
}
Yes! This does explain what we are experiencing. Really appreciate you working it through. Let me me know how we can get the new filter option and MISP SQL query tool here to try. Many Thanks!!
They should both be there now 😉
On Mon, Nov 11, 2019, 16:29 github-germ notifications@github.com wrote:
Yes! This does explain what we are experiencing. Really appreciate you working it through. Let me me know how we can get the new filter option and MISP SQL query tool here to try. Many Thanks!!
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/MISP/PyMISP/issues/490?email_source=notifications&email_token=AA37VQAYYN4GBL47TFE4X6LQTF26JA5CNFSM4JHJZQA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDXFQZI#issuecomment-552491109, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA37VQDXDBXBHX6UF4SED7DQTF26JANCNFSM4JHJZQAQ .
We're on 2.4.117. It's there? :-)
No, it's not ;)
Afaik, it will be available on 2.4.119 (118 was released last friday).
We will be upgrading to 2.4.118 in the next week or in our Dev Lab (Production lags behind that). Once .119 is out, we can resume this one. Thanks again!!
Interesting difference in
attribute
count when usingpymisp.search
while validating with a direct MySQLselect
. Can you help us understand how to get the two to match, e.g. is there some additional args needed with thesearch
?(Note: this was observed as we are implementing an Event/Attribute purge process as discussed in https://github.com/MISP/MISP/issues/5300:
Best method to "purge" old MISP data
)Thanks!!
Our Test Case
ExpandedPyMISP
SQL
MariaDB [misp]> select count() from attributes where timestamp >= 1262304000 and timestamp <= 1540857600 and deleted=1; +----------+ | count() | +----------+ | 0 | +----------+ 1 row in set (0.00 sec)
MariaDB [misp]> select count() from attributes where timestamp >= 1262304000 and timestamp <= 1540857600 and deleted=0; +----------+ | count() | +----------+ | 49154 | +----------+ 1 row in set (3 min 4.41 sec)
MariaDB [misp]>