Store finished simulation calculations (Min, Max, Mean) with Serialized JSON model

More bad ideas and annoying requests from me :)

TL;DR = As a user of pyfair I'd like to have the results of calculated nodes from a simulation stored in the Serialized JSON model so that I can yeet it into DocumentDB and marvel at my pseudoscientific riskiness

On a serious note, this may be a bit hard, but I would like a way to store summary statistics of the calculated values (Min, Max, Mean, Mode, etc., i.e. the usual DataFrame math operators) in the JSON model output.

For my purposes, I am able to supply data to all of the "child nodes" and let pyfair calculate VULN, TEF, LEF, LM and, ultimately, Risk for me. This also means I know exactly what I need to pull out of the DataFrame returned by .export_results(). It's hacky, but it's really fast because Pandas is awesome.

def simulations():
    '''
    HEAVILY TRUNCATED...
    '''
    # Run the simulation for the specific Threat Community
    tmodel = malwareModel.calculate_all()

    # In this section we will write the Model and the Simulation Results to DocDB
    tcomModelJson = json.loads(malwareModel.to_json())

    # Write out a DF and perform Mean, Min and Max calculations on the columns of the calculated nodes,
    # then add these into the JSON payload to keep the model inputs and outputs together
    tcomModelDf = tmodel.export_results()

    # MAX
    tcomModelJson['MaxRisk'] = int(tcomModelDf['Risk'].max())
    tcomModelJson['MaxLEF'] = float(tcomModelDf['Loss Event Frequency'].max())
    tcomModelJson['MaxTEF'] = float(tcomModelDf['Threat Event Frequency'].max())
    tcomModelJson['MaxVuln'] = float(tcomModelDf['Vulnerability'].max())
    tcomModelJson['MaxLM'] = int(tcomModelDf['Loss Magnitude'].max())

    # MIN
    tcomModelJson['MinRisk'] = int(tcomModelDf['Risk'].min())
    tcomModelJson['MinLEF'] = float(tcomModelDf['Loss Event Frequency'].min())
    tcomModelJson['MinTEF'] = float(tcomModelDf['Threat Event Frequency'].min())
    tcomModelJson['MinVuln'] = float(tcomModelDf['Vulnerability'].min())
    tcomModelJson['MinLM'] = int(tcomModelDf['Loss Magnitude'].min())

    # MEAN
    tcomModelJson['MeanRisk'] = int(tcomModelDf['Risk'].mean())
    tcomModelJson['MeanLEF'] = float(tcomModelDf['Loss Event Frequency'].mean())
    tcomModelJson['MeanTEF'] = float(tcomModelDf['Threat Event Frequency'].mean())
    tcomModelJson['MeanVuln'] = float(tcomModelDf['Vulnerability'].mean())
    tcomModelJson['MeanLM'] = int(tcomModelDf['Loss Magnitude'].mean())

Looking at the code in model.py, it looks like you can probably do this, but it would need a check to make sure calculate_all() was called, and a more dynamic way to bring in only the calculated nodes, as everything supplied would already be in there. Though there is a case to be made for populating everything, as your inputs will obviously always have a different output.
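
One way the "more dynamic" part could look is to aggregate every wanted column in a single pass instead of writing one line per node and statistic. This is only a sketch: the small DataFrame stands in for whatever .export_results() returns, and the summarize_results helper, its calculated argument, and the key naming scheme are made up for illustration, not part of pyfair.

```python
import json

import pandas as pd


def summarize_results(df, calculated=None, stats=("min", "max", "mean")):
    """Return a flat dict such as {'max_risk': ..., 'mean_loss_magnitude': ...}.

    `calculated` is a hypothetical list of column names to summarize;
    when omitted, every column in the DataFrame is summarized.
    """
    columns = list(df.columns) if calculated is None else list(calculated)
    # DataFrame.agg with a list of function names returns one row per statistic.
    table = df[columns].agg(list(stats))
    summary = {}
    for stat in stats:
        for column in columns:
            key = f"{stat}_{column.lower().replace(' ', '_')}"
            # Cast NumPy scalars to plain floats so json.dumps accepts them.
            summary[key] = float(table.loc[stat, column])
    return summary


# Stand-in for the DataFrame returned by .export_results() (values are made up).
results = pd.DataFrame({
    "Risk": [100.0, 250.0, 400.0],
    "Loss Event Frequency": [0.5, 1.0, 1.5],
    "Loss Magnitude": [200.0, 250.0, 300.0],
})

payload = {"name": "example model"}
payload.update(summarize_results(results, calculated=["Risk", "Loss Magnitude"]))
print(json.dumps(payload, indent=4))
```

Passing the list of calculated nodes explicitly keeps supplied inputs out of the summary; leaving it out would populate everything, which covers the other case mentioned above.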

My use case: I can now store all of the simulation data (uuid, seed, data, supplied fields) in case I ever want to rerun simulations and measure them over time. For example, I can measure macro-trends of our Resistance Strength or Contact Frequency over time for specific Apps / Business and put them into a heat map, because that's what risk is, right? On the flip side, having the mean/max/min/mode/etc. of the simulation stored within the JSON payload allows for similar analysis, data warehousing/data lake/BI, and some post-simulation use cases, such as comparing the risk to revenue contributions and combining with radically different models (e.g. run one simulation where SLEM/SLEF is modeled on ransom payouts and another modeled on punitive fines / lawsuits).

Took a stab at mocking this up in your to_json(self) method. What is the best way to compile from source, @theonaunheim? I can take a stab at a PR if you 1) think this is cool and 2) tell me how to implement some if / else to check if the model was calculated at all.

def to_json(self):
    """Dump the model as JSON string
    TRUNCATED AND EXISTING COMMENTS REMOVED
    """
    data = {**self._data_input.get_supplied_values()}
    # Add a check here to see if the model was calculated???
    df = self._model_table

    data['name'] = str(self._name)
    data['n_simulations'] = self._n_simulations
    data['random_seed'] = self._random_seed
    data['model_uuid'] = self._model_uuid
    data['type'] = str(self.__class__.__name__)
    data['creation_date'] = self._creation_date
    # More new stuff!
    data['max_risk'] = int(df['Risk'].max())
    data['max_loss_magnitude'] = int(df['Loss Magnitude'].max())
    ##----continue to do math!##

    json_data = json.dumps(
        data,
        indent=4,
    )
    return json_data
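
The if / else check asked about above might take roughly the following shape. This is a toy sketch, not pyfair's FairModel: the ToyModel class and its _is_calculated helper are hypothetical, and it assumes the results table holds no non-null Risk values until calculate_all() has run, which may not match how pyfair actually stores its table.

```python
import json

import pandas as pd


class ToyModel:
    """Hypothetical stand-in for a model; not pyfair's FairModel."""

    def __init__(self, name):
        self._name = name
        # Assumption: the results table is empty until the model is calculated.
        self._model_table = pd.DataFrame(columns=["Risk", "Loss Magnitude"])

    def calculate_all(self):
        # Made-up results standing in for a real simulation run.
        self._model_table = pd.DataFrame({
            "Risk": [10.0, 30.0],
            "Loss Magnitude": [5.0, 15.0],
        })
        return self

    def _is_calculated(self):
        # Treat the model as calculated once any non-null Risk value exists.
        df = self._model_table
        return "Risk" in df.columns and bool(df["Risk"].notna().any())

    def to_json(self):
        data = {"name": str(self._name)}
        # Only add summary statistics when there are results to summarize.
        if self._is_calculated():
            df = self._model_table
            data["max_risk"] = int(df["Risk"].max())
            data["max_loss_magnitude"] = int(df["Loss Magnitude"].max())
        return json.dumps(data, indent=4)


model = ToyModel("demo")
before = json.loads(model.to_json())   # no summary keys yet
after = json.loads(model.calculate_all().to_json())
print(before)
print(after)
```

With this shape, an uncalculated model still serializes with only its inputs, so existing callers of to_json() would not see an error just because no simulation has been run.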

Hive-Systems / pyfair

Store finished simulation calculations (Min, Max, Mean) with Serialized JSON model #39