nv-morpheus / Morpheus

Morpheus SDK
Apache License 2.0

[DOC]: DFP Starter Example README.md Needs to be Updated #1713

Open oguzhancelik2425 opened 4 months ago

oguzhancelik2425 commented 4 months ago

How would you describe the priority of this documentation request

High

Please provide a link or source to the relevant docs

https://github.com/nv-morpheus/Morpheus/blob/branch-24.06/examples/digital_fingerprinting/starter/README.md

Describe the problems in the documentation

This file gives three examples for the DFP pipeline, covering the CloudTrail, Azure AD, and Duo message log use cases. However, when I dug into the instructions for running the examples, I found the explanation lacking. Each use case comes with training and validation dataset samples. The CloudTrail input validation sample already contains ae_anomaly_score and ts_anomaly values, but I assume those are calculated by the model and should not be in a training or validation dataset. Moreover, I cannot see ae_anomaly_score and ts_anomaly being calculated in the train AE model stage; the calculation is in the hammah-inference.py module in the /models/validation-scripts/dfp-models/ directory. From this information it is hard to tell in what order the pipeline and scripts should be run, which files are for training, validation, and inference, and how the anomaly scores are calculated.
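To make the concern concrete, here is a minimal sketch of how one could check whether a sample CSV already carries those columns, and strip them before reusing the file as input. The two column names come from the description above; treating them as model outputs is my assumption, and the sample row and helper names are hypothetical, not part of Morpheus.

```python
import csv
import io

# Column names taken from the issue text. Treating them as model outputs
# (rather than input features) is an assumption.
MODEL_OUTPUT_COLUMNS = {"ae_anomaly_score", "ts_anomaly"}


def find_output_columns(csv_file):
    """Return the model-output columns present in the CSV header, in header order."""
    reader = csv.reader(csv_file)
    header = next(reader, [])
    return [name for name in header if name in MODEL_OUTPUT_COLUMNS]


def drop_output_columns(csv_file):
    """Return the rows as dicts with the model-output columns removed."""
    reader = csv.DictReader(csv_file)
    return [
        {key: value for key, value in row.items() if key not in MODEL_OUTPUT_COLUMNS}
        for row in reader
    ]


# Hypothetical one-row sample standing in for the real validation file.
SAMPLE = (
    "eventName,sourceIPAddress,ae_anomaly_score,ts_anomaly\n"
    "GetObject,10.0.0.1,0.12,False\n"
)

if __name__ == "__main__":
    print(find_output_columns(io.StringIO(SAMPLE)))
    print(drop_output_columns(io.StringIO(SAMPLE)))
```

For a real file, the same functions can be called on an `open(path, newline="")` handle instead of the in-memory sample.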

In addition, I noticed that some directory paths are out of date, which causes errors in the CLI. For example, the CLI example here passes --columns_file=morpheus/data/columns_ae_cloudtrail.txt to read the feature columns text file, but there is no such path under /morpheus/; it should be models/data/columns_ae_cloudtrail.txt. Similar problems exist in the rest of the README.md file.
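As a workaround until the README is fixed, a small helper like the sketch below can confirm which of the two locations exists in a local checkout before the value is passed to --columns_file. The two relative paths are the ones quoted above; the function name and the idea of probing a list of candidates are my own illustration, not something the Morpheus CLI provides.

```python
from pathlib import Path

# The README's path first, then the location that exists in my checkout.
CANDIDATES = (
    "morpheus/data/columns_ae_cloudtrail.txt",
    "models/data/columns_ae_cloudtrail.txt",
)


def resolve_first_existing(repo_root, candidates=CANDIDATES):
    """Return the first candidate path that exists under repo_root, or None."""
    root = Path(repo_root)
    for relative in candidates:
        path = root / relative
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    # Run from the root of the repository checkout.
    found = resolve_first_existing(".")
    if found is None:
        print("columns file not found in any known location")
    else:
        print(f"--columns_file={found}")
```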

(Optional) Propose a correction

I think this starter example needs a detailed review and more explanation of how the model works, which resources to use for the given examples, and the order in which to run them. For the example I described above, my understanding is that the user should first run the hammah-20211017.ipynb notebook with the training dataset, right after feature selection, and then use the model trained here to get the ae_anomaly_score and ts_anomaly values by running the hammah-inference.py module. Then the user runs the DFP pipeline CLI with the use case's features and the file written by the previous hammah-inference.py run, in order to get the reconstruction_loss, z_loss, scores values, etc. If I am wrong, it is unfortunately because the example is poorly documented.


efajardo-nv commented 4 months ago

@oguzhancelik2425 Thank you for submitting this issue. The Starter DFP implementation and documentation are somewhat stale because we decided to focus on a single DFP implementation, Production DFP. The two implementations have diverged significantly, so we have been encouraging everyone to start with Production DFP, which comes with a bit more complexity but incorporates new Morpheus features and is more scalable.

We plan on having Starter DFP removed for the next release (#1715).

oguzhancelik2425 commented 4 months ago

Thanks for the quick response @efajardo-nv. I have a question about the part of hammah_inference.py that returns the anomaly scores. It uses index 3 to get some values, but when I use it I get uniform values for the inference dataset. Could this module be outdated as well, like the rest of the starter pipeline? When I removed the [3] from the line, I got different anomaly scores for my inference. Was this line written by mistake?
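For reference, this is roughly how I checked that the scores were uniform. It is only a sanity-check sketch with a hypothetical helper name and made-up numbers; it does not call the autoencoder or reproduce anything from hammah_inference.py.

```python
import statistics


def looks_uniform(scores, tolerance=1e-9):
    """Return True when every score is (almost) the same value."""
    scores = list(scores)
    if len(scores) < 2:
        # Zero or one value cannot show any spread.
        return True
    # Population standard deviation is zero when all values are identical.
    return statistics.pstdev(scores) <= tolerance


if __name__ == "__main__":
    # Made-up values illustrating the two situations described above.
    print(looks_uniform([0.5, 0.5, 0.5]))
    print(looks_uniform([0.1, 0.9, 0.4]))
```

If the check returns True for a whole inference dataset, the indexed value is probably not a per-row score.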

efajardo-nv commented 4 months ago

@oguzhancelik2425 You are correct. That inference script, which is run independently of Morpheus, has not been kept in sync with the changes to the autoencoder and data file paths. We'll get that updated. Thanks!