iterative / dvc

🦉 ML Experiments and Data Management with Git
https://dvc.org
Apache License 2.0

DVC log_plot error is confusing, if the `template` attribute is set without a value. #10482

Open h-joshi opened 1 month ago

h-joshi commented 1 month ago

Bug Report

Description

DVC log_plot error is confusing, if the template attribute is set without a value.

Reproduce

1. Create a DVC repository.
2. Create a sample experiment that uses the `log_plot` function to log a few metrics (note: the `template` option is an empty string):

    import pandas as pd
    from dvclive import Live
    from sklearn.datasets import load_iris

    # Load the iris dataset into a DataFrame with named feature columns
    iris = load_iris()
    datapoints = pd.DataFrame(data=iris.data, columns=iris.feature_names)

    with Live() as live:
        live.log_plot(
            "sepal", datapoints,
            x="sepal length (cm)", y="sepal width (cm)",
            template="",  # empty value that triggers the confusing error
            title="Sepal width vs Sepal length",
        )

3. Run the experiment.
4. The error will be similar to the one below:
```bash
vscode ➜ /workspaces/processor/.dvc/tmp (feature/strategy3) $ dvc plots diff -v
2024-07-14 09:57:01,538 DEBUG: v3.51.2 (pip), CPython 3.10.13 on Linux-6.6.31-linuxkit-x86_64-with-glibc2.31
2024-07-14 09:57:01,538 DEBUG: command: /home/vscode/.local/bin/dvc plots diff -v
2024-07-14 09:57:03,690 ERROR: unexpected error - [Errno 21] Is a directory: '/workspaces/processor/.dvc/tmp'                                                                                                                              
Traceback (most recent call last):
  File "/home/vscode/.local/lib/python3.10/site-packages/dvc/cli/__init__.py", line 211, in main
    ret = cmd.do_run()
  File "/home/vscode/.local/lib/python3.10/site-packages/dvc/cli/command.py", line 27, in do_run
    return self.run()
  File "/home/vscode/.local/lib/python3.10/site-packages/dvc/commands/plots.py", line 107, in run
    renderers_with_errors = match_defs_renderers(
  File "/home/vscode/.local/lib/python3.10/site-packages/dvc/render/match.py", line 131, in match_defs_renderers
    renderer = renderer_cls(plot_datapoints, renderer_id, **first_props)
  File "/home/vscode/.local/lib/python3.10/site-packages/dvc_render/vega.py", line 87, in __init__
    self.template = get_template(
  File "/home/vscode/.local/lib/python3.10/site-packages/dvc_render/vega_templates.py", line 724, in get_template
    with _open(template_path, encoding="utf-8") as f:
IsADirectoryError: [Errno 21] Is a directory: '/workspaces/processor/.dvc/tmp'

2024-07-14 09:57:03,752 DEBUG: link type reflink is not available ([Errno 13] Permission denied: '/workspaces/.VcJKvhyzGauYU48vSH9qxQ.tmp')
2024-07-14 09:57:03,752 DEBUG: Removing '/workspaces/.VcJKvhyzGauYU48vSH9qxQ.tmp'
2024-07-14 09:57:03,753 DEBUG: link type hardlink is not available ([Errno 95] no more link types left to try out)
2024-07-14 09:57:03,753 DEBUG: Removing '/workspaces/.VcJKvhyzGauYU48vSH9qxQ.tmp'
2024-07-14 09:57:03,753 DEBUG: link type symlink is not available ([Errno 13] Permission denied: '/workspaces/processor/.dvc/cache/files/md5/.WaF1GsOTeDBjuMNIcuPZpw.tmp' -> '/workspaces/.VcJKvhyzGauYU48vSH9qxQ.tmp')
2024-07-14 09:57:03,753 DEBUG: Removing '/workspaces/.VcJKvhyzGauYU48vSH9qxQ.tmp'
2024-07-14 09:57:03,754 DEBUG: Removing '/workspaces/processor/.dvc/cache/files/md5/.WaF1GsOTeDBjuMNIcuPZpw.tmp'
2024-07-14 09:57:03,761 DEBUG: Version info for developers:
DVC version: 3.51.2 (pip)
-------------------------
Platform: Python 3.10.13 on Linux-6.6.31-linuxkit-x86_64-with-glibc2.31
Subprojects:
        dvc_data = 3.15.1
        dvc_objects = 5.1.0
        dvc_render = 1.0.2
        dvc_task = 0.4.0
        scmrepo = 3.3.6
Supports:
        http (aiohttp = 3.9.5, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.5, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2024.6.1, boto3 = 1.34.131)
Config:
        Global: /home/vscode/.config/dvc
        System: /etc/xdg/dvc
Cache types: 
Cache directory: fakeowner on /run/host_mark/Users
Caches: local
Remotes: s3
Workspace directory: fakeowner on /run/host_mark/Users
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/3e826ad9ce4b0c5c7907b07975f63692
```

Expected

The error should state that `template` needs a non-empty value whenever the attribute is set.
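A minimal sketch of the kind of check that could produce such a message. The names `validate_template` and `EmptyTemplateError` are hypothetical and are not part of DVC, DVCLive, or dvc_render; where such a check would actually live is an open question in this issue.

```python
from typing import Optional


class EmptyTemplateError(ValueError):
    """Hypothetical error type: `template` was passed but has no value."""


def validate_template(template: Optional[str]) -> Optional[str]:
    """Reject an empty `template` early, with a context-relevant message.

    `None` is passed through unchanged so a caller can still fall back
    to whatever default template it normally uses.
    """
    if template is None:
        return None
    if not template.strip():
        raise EmptyTemplateError(
            "`template` was set but is empty; pass a template name or path, "
            "or omit the argument to use the default."
        )
    return template
```

With a check like this, `validate_template("")` fails immediately with a message that names the `template` argument, instead of surfacing later as `IsADirectoryError`.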

Environment information

Output of dvc doctor:

DVC version: 3.51.2 (pip)
-------------------------
Platform: Python 3.10.13 on Linux-6.6.31-linuxkit-x86_64-with-glibc2.31
Subprojects:
        dvc_data = 3.15.1
        dvc_objects = 5.1.0
        dvc_render = 1.0.2
        dvc_task = 0.4.0
        scmrepo = 3.3.6
Supports:
        http (aiohttp = 3.9.5, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.5, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2024.6.1, boto3 = 1.34.131)
Config:
        Global: /home/vscode/.config/dvc
        System: /etc/xdg/dvc
Cache types: 
Cache directory: fakeowner on /run/host_mark/Users
Caches: local
Remotes: s3
Workspace directory: fakeowner on /run/host_mark/Users
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/3e826ad9ce4b0c5c7907b07975f63692

Additional Information (if any):

shcheklein commented 1 month ago

Thanks @h-joshi for the detailed report.

shcheklein commented 1 month ago

@h-joshi did you have some use case in mind to keep it empty?

I wonder if we should just raise an exception on an empty value and not allow it in the first place.

h-joshi commented 1 month ago

@shcheklein This should just raise an exception with a context-relevant error message. I had missed adding the template, and it took me a while to realise that the missing value was the root cause.

Shashank1202 commented 2 weeks ago

Hey, I'm willing to work on this. Can anyone help me find where exactly the `log_plot` function, or any related function, is defined?
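As a starting point, the traceback in the report fails inside `get_template` in `dvc_render/vega_templates.py`, called from `dvc_render/vega.py`. The path in the error is also the working directory shown in the shell prompt, which is consistent with the empty string being resolved to the current directory. The sketch below reproduces that behaviour in isolation on a POSIX system. It assumes the name is resolved with `os.path.abspath`, which has not been confirmed against the dvc_render source, and `open_template` is a hypothetical stand-in rather than the library's function.

```python
import os
import tempfile


def open_template(template_name: str) -> str:
    """Hypothetical stand-in for a template lookup by path.

    Assumption: the name is turned into an absolute path and opened.
    os.path.abspath("") returns the current working directory.
    """
    template_path = os.path.abspath(template_name)
    with open(template_path, encoding="utf-8") as f:
        return f.read()


def reproduce() -> str:
    """Call open_template("") from a scratch directory and report the error type."""
    with tempfile.TemporaryDirectory() as workdir:
        previous = os.getcwd()
        os.chdir(workdir)
        try:
            open_template("")
        except IsADirectoryError as exc:
            return type(exc).__name__
        finally:
            # Leave the scratch directory before it is cleaned up
            os.chdir(previous)
    return "no error"
```

If that assumption holds, a check for an empty `template` placed before the path lookup would be enough to replace the misleading error.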