change-metrics / monocle

Monocle helps teams and individual to better organize daily duties and to detect anomalies in the way changes are produced and reviewed.
https://demo.changemetrics.io/
GNU Affero General Public License v3.0
362 stars 56 forks source link

Support a groups attribute in any Ident attribute #1084

Closed morucci closed 6 months ago

morucci commented 7 months ago

The groups definition in the configuration file is mainly used to support queries filtered by group. Before performing requests to ElasticSearch the Monocle API retrieves identities from group definition to build ElasticSearch queries.

As the groups definition is not reflected into the database, the capability to perform aggregated queries by group on ElasticSearch is not possible.

The proposal here is to add a groups attribute for any Ident fields. Monocle will build the Ident's group list based on the configuration file before indexing a document in the database. The Janitor updateIdents command will also support that new attribute.

As a user could be part of multiple groups, then the groups attribute will be a terms array. ElasticSearch terms aggregation could then be used to query, for instance, the list of groups that commented or created changes during a time slice.

leonid-deriv commented 7 months ago

I think two "groups" fields should be implemented - one for author.muid and one for on_author.muid. Multiple group membership is not a problem. So we may have :

morucci commented 7 months ago

I agree and to that extend all fields of type AuthorIndexMapping https://github.com/change-metrics/monocle/blob/master/src/Monocle/Backend/Index.hs#L47 must own a groups field.

leonid-deriv commented 7 months ago

Yes, that makes sense

leonid-deriv commented 6 months ago

Any updates here? no pressure :). Ok, I see the PR is opened. 👍

leonid-deriv commented 5 months ago

In order to take advantage of this new feature I need to rebuild the index with the correct settings in the config file - idents section? Right?

morucci commented 5 months ago

Yes you need to set the proper idents config, then run https://github.com/change-metrics/monocle?tab=readme-ov-file#apply-idents-configuration

morucci commented 5 months ago

@leonid-deriv we experienced an issue with that feature when we upgraded the project's demo deployment. The command above was also failing. We just merged https://github.com/change-metrics/monocle/pull/1105 that solved the issue.

leonid-deriv commented 5 months ago

Yes you need to set the proper idents config, then run https://github.com/change-metrics/monocle?tab=readme-ov-file#apply-idents-configuration

When we apply this - does it mean that all previous records will be updated so I do not need to re-index all data?

morucci commented 5 months ago

Exact, that's the purpose of the command.

leonid-deriv commented 5 months ago

thank you, this is perfect. Will be doing upgrade this weekend. Just curious how long it will take to update 500K records?

morucci commented 5 months ago

It is "quite fast". It take 7 minutes with 5M docs from my side.

leonid-deriv commented 5 months ago

I am running the idents update command and have a problem

+] Building 0.0s (0/0)                                                                                                                                                                                                                                     docker:default
2024-01-26 09:26:30 INFO    Monocle.Backend.Janitor:49: Janitor will process changes and event {"workspace":"demo","changes":37432,"events":512941}
2024-01-26 09:27:18 INFO    Monocle.Backend.Janitor:51: Updated changes {"count":876}
monocle: esAdvance: Original error was: Error in $.hits.hits[2905]['_source']: parsing Monocle.Backend.Janitor.EChangeEventAuthors(EChangeEventAuthors) failed, key "id" not found Error parse failure was: Error in $: key "status" not found, req: "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFkJtNFFLQW82UjhDR21YRVRpTnNuVXcAAAAAAAAA8xZmMDFyRzVFeVJjR2VVUFVZMGNxZ0t, resp: {"_scroll_id":"FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFkJtNFFLQW82UjhDR21YRVRpTnNuVXcAAAAAAAAA8xZmMDFyRzVFeVJjR, tb: [("throwError",SrcLoc {srcLocPackage = "monocle-1.11.0-2ShLfwfFjyo1H0paLD2SMi", srcLocModule = "Monocle.Effects", srcLocFile = "src/Monocle/Effects.hs", srcLocStartLine = 403, srcLocStartCol = 15, srcLocEndLine = 403, srcLocEndCol = 27}),("runBHIOSafe",SrcLoc {srcLocPackage = "monocle-1.11.0-2ShLfwfFjyo1H0paLD2SMi", srcLocModule = "Monocle.Effects", srcLocFile = "src/Monocle/Effects.hs", srcLocStartLine = 445, srcLocStartCol = 3, srcLocEndLine = 445, srcLocEndCol = 14})]
CallStack (from HasCallStack):
  error, called at src/Relude/Debug.hs:289:11 in relude-1.2.0.0-Jiwa4gfuZvkK1snRof3V:Relude.Debug
  error, called at src/Monocle/Effects.hs:420:7 in monocle-1.11.0-2ShLfwfFjyo1H0paLD2SMi:Monocle.Effects

Actually, there are no type:"CachedAuthor" events in the index

After running it for the second time I got this error. And as you see updated changes 0, so looks like records are updated. Also I have noticed that some recent events are not updated. Any reason? Actually, the pattern is not that obvious - there are some old events which were not updated.

[+] Building 0.0s (0/0)                                                                                                                                                                                                                                     docker:default
2024-01-26 09:29:30 INFO    Monocle.Backend.Janitor:49: Janitor will process changes and event {"workspace":"demo","changes":37432,"events":512944}
2024-01-26 09:30:07 INFO    Monocle.Backend.Janitor:51: Updated changes {"count":0}
monocle: esAdvance: Original error was: Error in $.hits.hits[3398]['_source']: parsing Monocle.Backend.Janitor.EChangeEventAuthors(EChangeEventAuthors) failed, key "id" not found Error parse failure was: Error in $: key "status" not found, req: "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFkJtNFFLQW82UjhDR21YRVRpTnNuVXcAAAAAAAABwBZmMDFyRzVFeVJjR2VVUFVZMGNxZ0t, resp: {"_scroll_id":"FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFkJtNFFLQW82UjhDR21YRVRpTnNuVXcAAAAAAAABwBZmMDFyRzVFeVJjR, tb: [("throwError",SrcLoc {srcLocPackage = "monocle-1.11.0-2ShLfwfFjyo1H0paLD2SMi", srcLocModule = "Monocle.Effects", srcLocFile = "src/Monocle/Effects.hs", srcLocStartLine = 403, srcLocStartCol = 15, srcLocEndLine = 403, srcLocEndCol = 27}),("runBHIOSafe",SrcLoc {srcLocPackage = "monocle-1.11.0-2ShLfwfFjyo1H0paLD2SMi", srcLocModule = "Monocle.Effects", srcLocFile = "src/Monocle/Effects.hs", srcLocStartLine = 445, srcLocStartCol = 3, srcLocEndLine = 445, srcLocEndCol = 14})]
CallStack (from HasCallStack):
  error, called at src/Relude/Debug.hs:289:11 in relude-1.2.0.0-Jiwa4gfuZvkK1snRof3V:Relude.Debug
  error, called at src/Monocle/Effects.hs:420:7 in monocle-1.11.0-2ShLfwfFjyo1H0paLD2SMi:Monocle.Effects
leonid-deriv commented 5 months ago

An addition - it took some time and majority of events are "done" but unfortunately not all :(... I cannot see the pattern. This is a bit worrying because you cannot trust the reports. Please let me know if I need to provide you extra information

leonid-deriv commented 5 months ago

should I re-run the ident command? Ah, Ok, it is not merged yet.

morucci commented 5 months ago

@leonid-deriv yes you can re-run, the change is merged.