aimhubio / aim

Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
https://aimstack.io
Apache License 2.0

Directory Structures, DataFrame Page, & Others #3142

Open u3Izx9ql7vW4 opened 2 months ago

u3Izx9ql7vW4 commented 2 months ago

🚀 Feature

Hi, I've been using Aim pretty extensively, almost every day, for the last couple of weeks. Overall the experience has been really great -- orders of magnitude better than MLflow, which is what I was using before. The UI is particularly well-suited for ablation studies.

Along the way, I've noticed a few pain points and have turned them into feature requests.

Runs/Metrics Page: Organization

Some kind of directory structure that allows developers to organize runs. For example, I might want to organize my runs like so:

ModelA/
|   ModelA-VariationB/
|   |   run 2l3krjf9...
|   |   run sd0f9j3...
AblationTest2024/
|   FeatureX/
|   |   run 95zkhe...
|   |   run priwnc3...
DatasetTests/
|   DatasetX/
|   |   DatasetX-v2/
|   |   |  run bby47z...
|   |   |  run we5b6n..
Adhoc/
...

To take the directory analogy further, it would be great to be able to copy/move/paste runs.

It doesn't have to be modelled as a directory data structure behind the scenes; it could work like AWS S3's directories, which are just a flat container of objects where the illusion of directories is created by key names like dirA/subdirB/file, and the directory structure only appears in the UI. I think there are packages that do this for you, maybe s3fs, but this isn't my area of expertise.
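
For illustration, here's a minimal sketch of how that flat-prefix convention could be emulated today with Aim's existing params API (the `path` key, the repo path, and any UI that splits on `/` are assumptions on my part, not current features):

```python
from aim import Run

# Sketch: emulate folders with a flat, path-like key stored on each run,
# the same way S3 fakes directories with key prefixes like dirA/subdirB/file.
# The "path" param and a UI that renders it as a tree are hypothetical
# conventions, not existing Aim features.
run = Run(repo="/path/to/project", experiment="ModelA")
run["path"] = "ModelA/ModelA-VariationB"  # flat string; a UI could split it on "/"
run.track(0.42, name="loss", step=0)
```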

Metrics

It would be great if the plot could show the run name or some other identifier when I hover over a line. Right now when I display metrics on a graph, it shows a bunch of lines, and I have to look down at the legend at the bottom to match the color coding and figure out which run corresponds to which line.

(Screenshot, 2024-04-28: metrics plot with multiple unlabeled lines)

It would also be nice to be able to order the plots on this page. I think they're alphabetical at the moment. Usually I'm displaying 3+ metrics simultaneously, and alphabetical order happens to run from least important to most important, with the most important dead last. This is a bit of a drag.

Scatter Plot

Show step-level data on the scatter plot, not just a single point per run (details under Motivation below).

DataFrames Page

Have a page dedicated to seeing differences between the DataFrames of runs. There's already a really powerful "Show Table Diffs" feature, and this would be a killer feature to have when comparing DataFrames. It would also be nice to apply filters and display simple aggregates for the filtered rows, like count, mean/median, and standard deviation (e.g., formatted like count=50, mean=32 +/- 2.5), on the bottom right the way Excel does. Finally, the option to download the data as a CSV.
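
As a rough sketch of the footer I have in mind (pandas-based, with made-up column names for a predictions table):

```python
import pandas as pd

# Hypothetical illustration of the requested footer: filter a predictions
# DataFrame, then summarize the selection Excel-style.
df = pd.DataFrame({"pred": [3.1, 2.5, 5.0], "target": [3.0, 3.0, 3.5]})
sel = df[(df["pred"] - df["target"]).abs() > 0.1]  # a user-defined filter
err = sel["pred"] - sel["target"]
print(f"count={len(sel)}, mean={err.mean():.2f} +/- {err.std():.2f}")
# -> count=2, mean=0.50 +/- 1.41
```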

Getting started guide

I don't recall the getting started guide mentioning that I needed to specify a repo parameter in aim.Run. I got stuck on this and almost gave up because I couldn't get the simple example to show up as a run in the UI.
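
For anyone else who hits this, a minimal example of what finally worked for me (the repo path here is just illustrative; it needs to match the repo the UI server is serving):

```python
from aim import Run

# Point the Run at the same Aim repo the UI is reading from; in my case,
# when the paths didn't match, the run never showed up in the UI.
run = Run(repo="/path/to/project")
run.track(1.0, name="loss", step=0)
```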


Motivation

Organization

While the product is marketed for large quantities of runs, I'm finding it a little difficult to keep track of everything. Right now I have about 150 unarchived runs, and it's very quickly becoming a sea. Most of my runs are batched into themes, such as testing a new variation of an existing model, seeing the effect of adding/removing a feature, etc. But these "themes" vary widely, hence the request for user-defined directories.

The copy/paste request is related to benchmarking. When I create a new folder, I usually want to copy over a benchmark from a previous run rather than re-run the benchmark every time I start a new set of experiments.

The move request relates to reorganization. If I find that I have too many folders and they all fall under a unified theme, it would be really useful to nest them.

Scatter Plot

The scatter plot doesn't show step-level data like the Metrics page does. For example, if I have a run with 100 logged data points as metrics, the Metrics page will show all 100 points, one per step, but the scatter page only shows one data point (presumably the last one).

DataFrames Page

Manual examination of DataFrames is a pretty sizeable component of machine learning / statistical analysis. It's one thing to see aggregate statistics like RMSE, F1 score, loss, etc.; it's another to see where the model made its mistakes and how large those mistakes were. Often the latter exercise is much more informative.

One concrete use case is seeing prediction versus target, applying filters to see where the model got things really wrong, and being able to compare across models/features, etc. So as far as artifacts go, this is a pretty important one.

SGevorg commented 2 weeks ago

Hi @u3Izx9ql7vW4, thanks for sharing such detailed feedback. I'm trying to unpack these requests. For organization purposes, it would be great to have one issue per request so we can track them better. Otherwise, if we decide to ask around and dig deeper into these issues, the context will be all over the place.

Under this issue I will focus only on the Organization feature request 🙌

SGevorg commented 2 weeks ago

@u3Izx9ql7vW4 is it safe to say that, from the perspective of having lots of runs, being able to organize them is as important as being able to compare them? Our core focus has been comparison, and we only put together the experiments "abstraction" as a means of organizing runs. A folder structure on top of runs can be considered a generalization of the experiments Aim already has. What I'm trying to understand is whether basic folding is enough. Are there other hidden dimensions to this that are worth considering?