Open pkajaba opened 6 years ago
> As far as the relevancy of the components goes there is very little dead code.

Sounds really good, but
> Alternate recommendation - Are being moved outside for reasons other than refactoring

What are those reasons?
> companion recommendation - needs to stay, driven by the same component as outliers

Can't we extract that component into a lib, so outliers and companion can just reuse it?
I have one question. What actually is Kronos? Alternate recommendation + Outlier recommendation + companion recommendation together? Some diagrams of how the analytics pipeline currently runs would be great.
Btw, I have another proposal. We could unify the naming, because at this point the folder structure is analytics_platform/kronos, and inside this path there are the apollo, gnosis, pgm and softnet folders + a src folder.
I am sure that you can map these codenames to real components, but we should write better documentation or just rename them. I prefer creating documentation since those names are cool :-).
> what are those reasons?
Alternate recommendations don't actually use the PGM structure; they are based on a Jaccard distance metric computed over the package tags. However, right now we can't increase the tag count used for the similarity metric, because the PGM cannot accommodate that many tags on a single package. By moving alternates outside the PGM, we'll draw the tags from the unfiltered package tag map (containing more than four tags per package), enabling a better calculation of the similarity score and, in turn, of alternates.
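To illustrate the idea, here is a minimal sketch of tag-based Jaccard similarity over an unfiltered package tag map. It is not the code from the repository: the function names, the `package_tag_map` layout, and the sample packages and tags are all assumptions made up for the example.

```python
def jaccard_similarity(tags_a, tags_b):
    """Return |A & B| / |A | B| for two tag collections (0.0 if both are empty)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def jaccard_distance(tags_a, tags_b):
    """Jaccard distance is one minus the Jaccard similarity."""
    return 1.0 - jaccard_similarity(tags_a, tags_b)


def rank_alternates(package, package_tag_map, top_n=3):
    """Rank the other packages by tag similarity to `package`, most similar first."""
    target_tags = package_tag_map[package]
    scored = [
        (other, jaccard_similarity(target_tags, tags))
        for other, tags in package_tag_map.items()
        if other != package
    ]
    # Sort by descending similarity, then by name so ties are deterministic.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Hypothetical unfiltered tag map: packages may carry more than four tags.
package_tag_map = {
    "pkg-a": {"web", "http", "client", "async", "json"},
    "pkg-b": {"web", "http", "client", "json"},
    "pkg-c": {"math", "matrix"},
}

alternates = rank_alternates("pkg-a", package_tag_map, top_n=2)
```

With more tags per package the sets overlap in finer-grained ways, which is why lifting the tag limit should give a more discriminating similarity score.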
> What actually Kronos is? Alternate recommendation + Outlier recommendation + companion recommendation together? Some diagrams how analytics pipeline is currently running would be great.
Yes.
gnosis - This is the generation of the reference architecture.
softnet - The packages are added as leaf nodes to the reference architecture graph; it also contains something called a similarity dict, used to drive outliers.
pgm - The actual pomegranate model; it is trained using what we generated in gnosis and softnet.
There's some documentation that I wrote as part of the knowledge transfer sessions here: https://docs.google.com/document/d/1f6dgwvf44kTTbZ1ascvbcxZoTvC4wVq7K5e6PkeHWJo/edit
EDIT: Actually, the diagram in it is not all too accurate. Need to fix it.
@rootAvish Great resource! Thank you. However, let's migrate information from the document into a repository, so it will be in one place.
This task was created after an agreement on the Mattermost channel. @rootAvish, @sara-02 and I agreed that there is way too much functionality inside the https://github.com/fabric8-analytics/fabric8-analytics-stack-analysis repository.
My suggestion would be to 1) outline what components are inside, 2) consider whether those components are still relevant for us, 3) rearrange relevant components and 4) refactor these components.
I will be thankful for any input, especially from @rootAvish and @sara-02, since they are the specialists for this repository right now.