allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0

Installed Packages only show those imported in the main script file #1198

Closed wxdrizzle closed 2 months ago

wxdrizzle commented 5 months ago

Describe the bug

Hi, thanks a lot for this excellent project! I've really enjoyed using it. I'm not sure whether this is a bug or just something I missed. It seems that the Installed Packages section of a task in the WebApp only shows packages that are directly imported in the main file that is executed.

To reproduce

I'm running the code manually through a main file, train.py, which contains only a few lines that call a train() function imported from another package:

from deep_kit import main

if __name__ == '__main__':
    main.train()

where deep_kit is a package I wrote to contain all the experiment code, and I used pip install -e deep_kit to make it an "installed" package that can be used in any of my projects.

When I manually run the code, e.g. python train.py --xxx, where --xxx specifies the YAML file I use for the experiment config, the task appears in the ClearML GUI and everything looks fine, except for the Installed Packages section:

image

As you can see, it ignores all the modules I imported inside deep_kit, including PyTorch and others. Moreover, if I manually add import numpy to train.py, then numpy appears in the section; but if I add from models.aaa import AAA, where models is a folder at the same level as train.py that contains aaa.py, the section still does not show the packages imported in aaa.py.

Does this mean the section can only track packages imported in the main file that is executed? Or did I miss a feature or setting that would make it work?
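To illustrate the behaviour described above, here is a minimal sketch of what a static scan limited to the entry script would see. It is only an illustration built on Python's ast module, not ClearML's actual detection code, and the contents of DEEP_KIT_MAIN_PY are hypothetical stand-ins for what deep_kit imports.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module.split(".")[0])
    return names


# The entry script from the report.
TRAIN_PY = '''
from deep_kit import main

if __name__ == '__main__':
    main.train()
'''

# Hypothetical contents of deep_kit/main.py (an assumption for illustration).
DEEP_KIT_MAIN_PY = '''
import torch
import numpy as np

def train():
    pass
'''

entry_only = top_level_imports(TRAIN_PY)
with_package = entry_only | top_level_imports(DEEP_KIT_MAIN_PY)

print(sorted(entry_only))    # ['deep_kit']
print(sorted(with_package))  # ['deep_kit', 'numpy', 'torch']
```

A scan that stops at train.py never reaches the imports made inside deep_kit, which would match the symptom reported here.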

Many thanks!

Expected behaviour

The section should include all packages imported in any of the files used in this experiment.

Environment

dvando commented 4 months ago

Hi, as far as I know it should list all the installed packages in the environment, as stated in the docs. It will also list any packages that are not installed in the environment but are still used in the code (just in a different dropdown). I did almost the same thing as you, and it correctly listed all the packages installed in my environment (I was using conda).

Still, it is indeed odd that in your case it only lists those 2 packages; this might need investigation.
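For comparison, "all the installed packages in the environment" corresponds to a pip-freeze-style listing that does not depend on which files import what. The following is a minimal standard-library sketch of such a listing, offered as a way to check what the environment contains; freeze_environment is a hypothetical helper, not part of ClearML.

```python
from importlib import metadata


def freeze_environment() -> list[str]:
    """Return 'name==version' lines for every distribution visible on sys.path."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:
            continue  # metadata file missing
        name = meta.get("Name")
        if name:  # skip distributions with broken metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in freeze_environment():
    print(line)
```

Running this with the same interpreter that runs train.py gives a baseline to compare against what the WebApp displays.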

wxdrizzle commented 4 months ago

Hi @dvando ,

Thank you for the information! Would you mind sharing part of your clearml.conf file (with sensitive info removed) with me? I'd just like to check whether my config (especially the agent.package_manager entry) is different from yours. I'm using:

agent {
    git_user: ""
    git_pass: ""
    python_binary: "/home/xin/software/anaconda3/envs/research/bin/python"
    package_manager: {
        conda_env_as_base_docker: true
        type: conda
    }
}

Besides, I investigated this further, and the new results are very interesting: if I add a line import tmp to the train.py that I execute, where tmp.py is an empty file (no code inside it) in the same folder, then the WebApp clearly shows all installed packages in the environment:

image

But if I remove that import tmp, then only clearml and deep_kit show up in the "installed packages".

Previously, I guessed that the WebApp only lists packages that are explicitly imported, and that because deep_kit is an installed package containing all my import statements, such as import torch, the WebApp does not list them. But you mentioned "it should list all the installed packages in the environment", and my new result shows that with an import tmp it indeed works well. So I fully agree with you that this needs investigation by the team. Thank you again, and hopefully we can get answers.

wxdrizzle commented 2 months ago

#1245 provides a detailed explanation of how to tackle this.