Closed by miquelduranfrigola 9 months ago
Update
I have written a placeholder class S3Logger that will be able to upload lake calculations to S3.
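A minimal sketch of what such a placeholder could look like, assuming boto3; the bucket name, key layout, and model-id parsing are illustrative assumptions, not the actual Ersilia implementation:

```python
from pathlib import Path

# Hypothetical sketch of a placeholder S3Logger; the bucket name, key
# layout, and model-id parsing are assumptions for illustration only.
class S3Logger:
    def __init__(self, bucket="ersilia-logs", prefix="lake"):
        self.bucket = bucket
        self.prefix = prefix

    def _key_for(self, path):
        # Store each file under <prefix>/<model_id>/<filename>,
        # e.g. lake/eos5axz/eos5axz_lake.csv
        path = Path(path)
        model_id = path.stem.split("_")[0]
        return f"{self.prefix}/{model_id}/{path.name}"

    def upload(self, path):
        # boto3 is imported lazily so the class can be instantiated
        # without AWS credentials or the boto3 dependency installed.
        import boto3
        boto3.client("s3").upload_file(str(path), self.bucket, self._key_for(path))
```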
Next steps:
@miquelduranfrigola Your solution is robust here. Since the volume of logs will grow, we also need to consider the following points in the future:
Q. Why are we not writing directly to Splunk? Again, I am thinking from the perspective of latency.
Thanks @honeyankit
These are very important points. The way we have set up Splunk is that it will ingest data cumulatively, so, actually, we can overwrite the logs every time we do a calculation, and Splunk will just ingest the updated log. This means that, from an S3 perspective, costs will be stable and negligible. I hope this makes sense?
In terms of latency, fortunately it is not a concern. We will use the Splunk server mainly for stats purposes (for example, to know how much each model has been used, or to provide usage statistics to our funders). So latency is really not a constraint. We will produce these reports on a monthly basis, or even every three months.
The reason we are using S3 as an "intermediate" is that folks at Splunk already set up the tool for us to read from an "always accessible" folder structure. I don't know how this could be adapted to GitHub Actions. Perhaps more importantly, we often run calculations from outside GitHub Actions, in which case S3 becomes a centralized place to deposit the log data.
Update: Harvard T4SG volunteers took over this task and have delivered a neat solution. Basically, Ersilia now contains a --track flag that allows us to upload files to an S3 bucket, which is eventually monitored by Splunk.
I am closing this issue for now.
Background
We have set up a physical Splunk server with the goal of keeping track of all model precalculations. This is a critical step as we scale up the Ersilia Model Hub. For now, the server is not publicly accessible. Please reach out to @miquelduranfrigola if you want to know more.
Ersilia Model TA App
The Ersilia Model TA App is the main monitoring app for model precalculations in Ersilia. The app ingests data from a remote device; a receiving port (9997) was opened on the all-in-one (AIO) instance.
Remote device with Splunk Universal Forwarder
Splunk Universal Forwarder is a lightweight version of Splunk used to send data. The following two apps should be placed on every machine with a Splunk Universal Forwarder. All model outputs should go to that same folder structure on each machine to avoid having to modify the inputs app.
ersilia_all_outputs: Sends data to the public IP of the AIO instance over port 9997.
ersilia_inputs: Collects data from all files in /var/log/ersilia_data, for example:
eos5axz_lake.csv
eos5axz.log
eos5axz.json
An example folder containing this data is available here. Data in this folder should be placed in /var/log/ersilia_data.
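For reference, the two forwarder apps above roughly correspond to a pair of standard Universal Forwarder configuration files. This is a sketch: the IP address and sourcetype name are placeholders, not the values actually deployed.

```
# ersilia_inputs/local/inputs.conf: watch the shared log folder
[monitor:///var/log/ersilia_data]
disabled = false
sourcetype = ersilia_precalculations

# ersilia_all_outputs/local/outputs.conf: forward to the AIO instance
[tcpout]
defaultGroup = ersilia_aio

[tcpout:ersilia_aio]
server = <AIO_PUBLIC_IP>:9997
```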
At the moment, this is installed on a local computer. The goal of this issue is to migrate the ersilia_data folder to the cloud.

Ersilia CLI logger
We have written a RunLogger class that creates an ersilia_runs folder with the data in the format required by Splunk. In practice, the ersilia_runs folder has exactly the same structure as the ersilia_data folder.

Next steps
My suggestion would be to have an S3 bucket where we store the ersilia_runs folder. At the end of every run (locally, in GitHub Actions, etc.), the data will be uploaded to S3 (if permissions are available). Splunk will then monitor the S3 bucket and, whenever a change is made to it, the change will be ingested and reflected in the dashboard that we already have.
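The end-of-run upload step could be sketched as follows. The bucket name, the runs folder location, and the credential check are assumptions for illustration, not the delivered implementation:

```python
import os
from pathlib import Path

# Hypothetical end-of-run hook; bucket name, runs folder, and the
# credential check are illustrative assumptions.
def upload_run_logs(runs_dir="ersilia_runs", bucket="ersilia-splunk-logs"):
    """Upload every file under runs_dir to S3, mirroring the folder layout."""
    paths = sorted(p for p in Path(runs_dir).rglob("*") if p.is_file())
    if not (os.environ.get("AWS_ACCESS_KEY_ID") or os.environ.get("AWS_PROFILE")):
        # No permissions available (e.g. a fork's CI run): skip quietly.
        return []
    import boto3  # imported lazily; only needed when credentials exist
    client = boto3.client("s3")
    uploaded = []
    for path in paths:
        # Keep the ersilia_runs/... layout as the S3 key, so Splunk sees
        # the same folder structure it already monitors locally.
        key = str(path.relative_to(Path(runs_dir).parent))
        client.upload_file(str(path), bucket, key)
        uploaded.append(key)
    return uploaded
```

With this shape, Splunk only needs to watch the bucket for changes, exactly as described above.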