Backend server reorganization

forstmeier commented 7 years ago

This is a proposal to simplify and optimize the backend server structure in order to 1) improve logical data flow, 2) make models easier to insert and reason about, and 3) keep to the prime directive of "plug-n-play".

Currently, a variety of structs and sub-structs are embedded in the backend server; this change would slim them down to a smaller set. Note that it would have broader ramifications for the current build, so it may not be part of the beta rollout.

This is a sketch of how the hierarchy would sit:

ArchRepo (remains the same in essence, part of a map on the BackendServer w/ mutex locks)
|_ Client (same as currently implemented)
|_ ArchHive
    |_ Confusion matrix
    |_ Etc. (other tools for evaluating progress)
    |_ Blender (moved into the scope of the ArchHive; provides logic to mix/weight results)
    |_ ArchModel (this would be a slice of the ArchModel struct)
        |_ Blender (additional layer of weighting results)
        |_ Model (just like currently implemented)
            |_ Conflation logic (moved to be directly inside the model to manipulate data)
            |_ Scenarios (for the front-end filtering which would be referenced by the FrontendServer)
            |_ MemSQL insertion logic (possibly, IDK if this is the best spot for it)

The takeaway here is that as much model-related logic as possible is clustered into that particular ArchModel struct.
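To make the nesting concrete, here is a rough Go sketch of the hierarchy above. Everything in it is an assumption for illustration: the field names, the `int` repo ID used as the map key, the shape of the confusion matrix, and the `NewBackendServer` / `AddRepo` / `Repo` helpers are all hypothetical and not the current implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// Client stands in for the existing client (unchanged by this proposal).
type Client struct{}

// Blender holds the logic to mix/weight results; Weights is a hypothetical field.
type Blender struct {
	Weights []float64
}

// Model stands in for the existing model. Conflation logic, Scenarios, and
// (possibly) MemSQL insertion logic would move inside it.
type Model struct {
	Scenarios []string // hypothetical: used for front-end filtering
}

// ArchModel clusters the model-related logic in one place.
type ArchModel struct {
	Blender Blender // additional layer of weighting results
	Model   Model
}

// ArchHive holds the evaluation tools and the slice of ArchModel structs.
type ArchHive struct {
	ConfusionMatrix map[string]map[string]int // hypothetical shape
	Blender         Blender
	Models          []ArchModel
}

// ArchRepo remains the same in essence: one entry per repository.
type ArchRepo struct {
	Client *Client
	Hive   *ArchHive
}

// BackendServer keeps the ArchRepo values in a map guarded by a mutex.
type BackendServer struct {
	mu    sync.RWMutex
	repos map[int]*ArchRepo // key type is an assumption
}

// NewBackendServer returns a server with an initialized repo map.
func NewBackendServer() *BackendServer {
	return &BackendServer{repos: make(map[int]*ArchRepo)}
}

// AddRepo stores a repo under the given ID, taking the write lock.
func (bs *BackendServer) AddRepo(id int, r *ArchRepo) {
	bs.mu.Lock()
	defer bs.mu.Unlock()
	bs.repos[id] = r
}

// Repo looks up a repo by ID, taking the read lock.
func (bs *BackendServer) Repo(id int) (*ArchRepo, bool) {
	bs.mu.RLock()
	defer bs.mu.RUnlock()
	r, ok := bs.repos[id]
	return r, ok
}

func main() {
	bs := NewBackendServer()
	bs.AddRepo(1, &ArchRepo{
		Client: &Client{},
		Hive:   &ArchHive{Models: []ArchModel{{}, {}}},
	})
	if repo, ok := bs.Repo(1); ok {
		fmt.Println("models in hive:", len(repo.Hive.Models))
	}
}
```

Keeping the map private behind `AddRepo` and `Repo` is one way to make sure every access goes through the lock; the real server may well expose it differently.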

Here are some methods that would exist on the individual structs:

(am *ArchModel) Normalize - used in building the "web" out of the required JSON objects
- likely generates a HeuprObject which is fed into Learn/Predict
- HeuprObject would be a replacement for the generalized ExpandedIssue struct
(ah *ArchHive) / (am *ArchModel) Blend - wrappers to engage the Blender logic
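As a rough sketch of how these methods might look, the following assumes that Normalize decodes a raw JSON payload into a HeuprObject and that the Blender computes a weighted average of per-model scores. The HeuprObject fields, the method signatures, and the weighted-average choice are all assumptions made for illustration; the proposal does not specify them.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// HeuprObject is the proposed replacement for ExpandedIssue.
// These fields are hypothetical placeholders.
type HeuprObject struct {
	ID    int    `json:"id"`
	Title string `json:"title"`
	Body  string `json:"body"`
}

// Blender mixes/weights results. A weighted average is assumed here.
type Blender struct {
	Weights []float64
}

// Blend returns the weighted average of scores using b.Weights.
func (b Blender) Blend(scores []float64) (float64, error) {
	if len(scores) == 0 || len(scores) != len(b.Weights) {
		return 0, errors.New("blender: scores and weights must be non-empty and equal in length")
	}
	var sum, total float64
	for i, s := range scores {
		sum += b.Weights[i] * s
		total += b.Weights[i]
	}
	if total == 0 {
		return 0, errors.New("blender: weights sum to zero")
	}
	return sum / total, nil
}

// ArchModel and ArchHive are trimmed to the fields this sketch needs.
type ArchModel struct {
	Blender Blender
}

type ArchHive struct {
	Blender Blender
	Models  []ArchModel
}

// Normalize turns a raw JSON object into a HeuprObject, which would then
// be fed into Learn/Predict.
func (am *ArchModel) Normalize(raw []byte) (HeuprObject, error) {
	var obj HeuprObject
	if err := json.Unmarshal(raw, &obj); err != nil {
		return HeuprObject{}, fmt.Errorf("normalize: %w", err)
	}
	return obj, nil
}

// Blend on ArchModel is a thin wrapper around the model-level Blender.
func (am *ArchModel) Blend(scores []float64) (float64, error) {
	return am.Blender.Blend(scores)
}

// Blend on ArchHive is a thin wrapper around the hive-level Blender.
func (ah *ArchHive) Blend(scores []float64) (float64, error) {
	return ah.Blender.Blend(scores)
}

func main() {
	am := &ArchModel{Blender: Blender{Weights: []float64{1, 3}}}

	obj, err := am.Normalize([]byte(`{"id": 7, "title": "Example issue"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("normalized object ID:", obj.ID)

	blended, err := am.Blend([]float64{0, 1})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("blended score:", blended)
}
```

Having both Blend methods delegate to a shared Blender type keeps the weighting logic in one place, so the hive-level and model-level layers differ only in their weights.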

Note that this issue likely supersedes #15

Note that these comments aren't necessarily directly related to the architectural rebuild outlined here, but they do touch on the "updatability" of the data we are working with (which may ultimately need to be reflected in our database design).

Here are some whiteboard outlines of the proposal:

Additionally, here's another outline with some added broad structural detail:

See old txt files in comments

taylormike commented 7 years ago

ArchRepo.go.txt

taylormike commented 7 years ago

ArchRepo_idea.go.txt

forstmeier commented 7 years ago

One other thing to consider about this restructuring: individual models may in the future use training features other than issues/pulls, and this will have to be reflected in the part of the logic that handles data collection directly (ingestor/). Currently the database and its related collection assets only handle issues/pulls, so they would likely need to be updated eventually as well.
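One possible way to leave room for this, sketched very loosely below, is to have the ingestor dispatch to per-feature collectors behind a small interface. The `Collector` interface, the `Record` type, the `Ingestor` registry, and the in-memory `staticCollector` are all hypothetical names invented for this sketch; nothing here reflects the current ingestor/ code.

```go
package main

import "fmt"

// Record is a hypothetical generic unit of collected training data.
type Record struct {
	Kind    string // e.g. "issue" or "pull"
	Payload []byte
}

// Collector is a hypothetical interface each feature source would implement.
type Collector interface {
	Kind() string
	Collect(repoID int) ([]Record, error)
}

// Ingestor keeps a registry of collectors keyed by the kind they produce.
type Ingestor struct {
	collectors map[string]Collector
}

// NewIngestor returns an Ingestor with an empty registry.
func NewIngestor() *Ingestor {
	return &Ingestor{collectors: make(map[string]Collector)}
}

// Register adds a collector, replacing any existing one of the same kind.
func (ing *Ingestor) Register(c Collector) {
	ing.collectors[c.Kind()] = c
}

// Collect gathers records for the requested kinds, failing on unknown kinds.
func (ing *Ingestor) Collect(repoID int, kinds ...string) ([]Record, error) {
	var out []Record
	for _, k := range kinds {
		c, ok := ing.collectors[k]
		if !ok {
			return nil, fmt.Errorf("ingestor: no collector registered for %q", k)
		}
		recs, err := c.Collect(repoID)
		if err != nil {
			return nil, fmt.Errorf("ingestor: collecting %q: %w", k, err)
		}
		out = append(out, recs...)
	}
	return out, nil
}

// staticCollector is an in-memory stand-in for a real data source.
type staticCollector struct {
	kind string
}

func (s staticCollector) Kind() string { return s.kind }

// Collect returns a single placeholder record for the given repo.
func (s staticCollector) Collect(repoID int) ([]Record, error) {
	payload := []byte(fmt.Sprintf(`{"repo": %d}`, repoID))
	return []Record{{Kind: s.kind, Payload: payload}}, nil
}

func main() {
	ing := NewIngestor()
	ing.Register(staticCollector{kind: "issue"})
	ing.Register(staticCollector{kind: "pull"})

	recs, err := ing.Collect(1, "issue", "pull")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range recs {
		fmt.Println(r.Kind, string(r.Payload))
	}
}
```

With this shape, a model that needs a new kind of training feature would only require a new collector to be registered, while the issues/pulls path stays as it is.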

heupr / core

Backend server reorganization #18