Closed yangwenzhuo08 closed 1 year ago
@yangwenzhuo08 thanks for your changes! This looks great. I've finished what you started in terms of restructuring the module. Now, `merlion.dashboard` is fully integrated into Merlion itself. The dashboard's dependencies have been added as optional requirements in `setup.py`, so the user can install the dashboard with `pip install salesforce-merlion[dashboard]`. The user may manually start up the dashboard with `python -m merlion.dashboard`, or serve it with Gunicorn via `gunicorn -b 0.0.0.0:80 merlion.dashboard.server:server`. Additionally, the dashboard is now able to handle exogenous regressors.
In terms of my original comments, can you add the documentation I requested previously? Besides this, I have a couple of new requests.
Can we allow `max_forecast_steps = None` to be a valid specification? It's actually the default setting for most models and is necessary for long-horizon forecasting.

@aadyotb Thanks for the revision. For the forecasting tab, we can split the train file and test file as the anomaly tab does. However, to combine these two UIs (uploading two files vs. uploading a single file with a split fraction), I'm not sure what layout is best. Do you have a suggestion for the UI design here? For forecasting it may be straightforward: two dropdown lists, one for the train file and one for the test file, plus a slider to set the fraction used to split the training data into "train" and "validation". But for anomaly detection, such a split is problematic when the number of labels is small, i.e., it is possible that the validation split contains no anomalies at all.
@yangwenzhuo08 I envision something like the following: you can have a radio box which can select "use same file for train/test" or "use separate test file". If you select "use same file for train/test", you get the slider where you specify the train/test fraction. If you select "use separate test file", you get a prompt to choose the test file. If you specify "use separate test file", the module should throw an error if the test data is not given. What do you think?
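The selection logic above could be sketched roughly as follows. Note this is a hypothetical illustration, not code from the Merlion dashboard; the function name `get_train_test` and the mode strings are made up here:

```python
def get_train_test(data, mode, test_data=None, train_frac=0.8):
    """Resolve the radio-box selection into (train, test) splits.

    mode: "same-file"     -> split `data` at `train_frac`
          "separate-file" -> require an explicitly chosen `test_data`
    (Names are illustrative, not from the Merlion codebase.)
    """
    if mode == "same-file":
        n_train = int(len(data) * train_frac)
        return data[:n_train], data[n_train:]
    if mode == "separate-file":
        # Per the proposal: error out if no test file was chosen.
        if test_data is None:
            raise ValueError(
                "A test file must be chosen when 'use separate test file' is selected."
            )
        return data, test_data
    raise ValueError(f"Unknown mode: {mode!r}")
```

The key design point is that the error is raised eagerly at split time, so the dashboard can surface a clear message instead of failing later during model evaluation.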
And in terms of anomaly detection, it's kind of a well-known issue that the labels are sparse. The evaluation metrics are implemented in such a way that they have reliable fallback options if there are no true positives present in the data. Maybe you can use the `plot_anoms` helper function in `merlion.plot` to plot the ground truth anomalies (if they are specified), and then also report the evaluation metrics on both train and test?
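The fallback behavior described here can be illustrated with a minimal pointwise precision/recall/F1 sketch. This mirrors the idea (empty label or prediction sets never cause a division-by-zero crash), not Merlion's actual metric implementation:

```python
def point_precision_recall_f1(true_anoms, pred_anoms):
    """Pointwise precision/recall/F1 over sets of anomalous timestamps.

    Falls back to 0.0 whenever a denominator would be empty, so sparse
    (or absent) labels produce a defined score instead of an error.
    (Illustrative sketch, not the Merlion implementation.)
    """
    true_anoms, pred_anoms = set(true_anoms), set(pred_anoms)
    tp = len(true_anoms & pred_anoms)
    precision = tp / len(pred_anoms) if pred_anoms else 0.0
    recall = tp / len(true_anoms) if true_anoms else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

With this convention, a validation split containing no labeled anomalies simply yields zero scores rather than an exception, which is what makes reporting metrics on both train and test safe.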
So the layout is like this:
Yes, this sounds good.
This PR implements a web-based visualization dashboard for Merlion. Users can get it set up by installing Merlion with the optional `dashboard` dependency, i.e. `pip install salesforce-merlion[dashboard]`. Then, they can start it up with `python -m merlion.dashboard`, which will start the dashboard on port 8050. The dashboard has 3 tabs: a file manager where users can upload CSV files & visualize time series; a forecasting tab where users can try different forecasting algorithms on different datasets; and an anomaly detection tab where users can try different anomaly detection algorithms on different datasets. This dashboard thus provides a no-code interface for users to rapidly experiment with different algorithms on their own data, and examine performance both qualitatively (through visualizations) and quantitatively (through evaluation metrics).

We also provide a Dockerfile which runs the dashboard as a microservice on port 80. The Docker image can be built with `docker build . -t merlion-dash -f docker/dashboard/Dockerfile` from the Merlion root directory, and deployed with `docker run -dp 80:80 merlion-dash`.