Closed dberenbaum closed 1 year ago
I think that a new experiment in the workspace should be automatically selected (and, when it's done, keep its color and selection in the tree). This seems like a bug to me. We can indeed expand the tree.
I am hoping that the issue here is that the experiment takes 0.2s to run. That is not enough time for DVCLive to create a signal file and for it to get picked up by the extension.
Can you please share the project so that I can investigate.
Sorry, there's no existing project. I started from scratch and wrote that into a notebook to test onboarding end to end.
I realized later this must be a bug or else we would have noticed it earlier. The plots seem to be fixed if I delay the experiment time, so I think you are correct.
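For anyone else testing with a toy experiment, here is a minimal sketch of the "delay the experiment" workaround mentioned above. It assumes the only problem is that the run finishes before the extension notices it. `run_with_min_duration` and the 2-second default are hypothetical names and values for illustration, not part of DVCLive or the extension.

```python
import time


def run_with_min_duration(train_fn, min_seconds=2.0):
    """Call train_fn, then sleep so the whole run lasts at least min_seconds.

    Hypothetical helper: pads very short runs so that tooling watching the
    run has a chance to notice it before it finishes.
    """
    start = time.monotonic()
    result = train_fn()
    remaining = min_seconds - (time.monotonic() - start)
    if remaining > 0:
        time.sleep(remaining)
    return result
```

Wrapping the notebook's training call in this helper only papers over the timing issue while testing. It is not a fix for the underlying detection problem.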
- In the "experiments" section in the sidebar, expand to show the new experiment and highlight it by default.
This is a new feature request I think. The sidebar shows the spinning circle for the workspace when the experiment is running, but it's a little confusing when it finishes and you only see the 2 hollow circles for workspace and main. Is it possible to uncollapse main by default? It would be much clearer that a new experiment was created from the finished workspace experiment if you suddenly saw a new row appear in the sidebar after the spinner stops.
Now that we are naming experiments at the start, we can probably also do a better job inside dvc exp show showing that the workspace is equivalent to sleek-sida or whichever experiment is running there. What do you think @daavoo?
This is a new feature request I think. The sidebar shows the spinning circle for the workspace when the experiment is running, but it's a little confusing when it finishes and you only see the 2 hollow circles for workspace and main. Is it possible to uncollapse main by default?
This should only be a couple of lines of code. I can do this today.
I'm hitting the same issue (when a DVCLive exp is running there are no updates, the workspace doesn't detect it, etc). On Codespaces, with the new @alex000kim repo, notebook tag: https://github.com/iterative/dvc-get-started-cv/tree/1-notebook-dvclive
Same issue (experiment runs very quickly) or different?
The initial issue. I don't think the experiment is running too fast in this case.
@shcheklein I'm unable to recreate. For me the result is slow but as expected:
Yep, also not able to reproduce this.
I hit another issue, though. Exp failed, but VS Code is still considering it as a running exp:
https://user-images.githubusercontent.com/3659196/210304245-7da26bd4-2fd9-4db5-9fef-77c62d6e7c15.mov
For the failed exp, is it expected @daavoo @dberenbaum ? (this is a Codespaces, new @alex000kim project, after git checkout 1-notebook):
Stack trace:
GitError Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/scmrepo/git/backend/pygit2.py:714, in Pygit2Backend.merge(self, rev, commit, msg, squash)
713 try:
--> 714 self.repo.merge(obj.id)
715 self.repo.index.write()
GitError: 1 uncommitted change would be overwritten by merge
The above exception was the direct cause of the following exception:
SCMError Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/dvc/repo/experiments/executor/local.py:230, in WorkspaceExecutor.init_git(self, repo, scm, stash_rev, entry, infofile, branch)
229 try:
--> 230 self.scm.merge(merge_rev, squash=True, commit=False)
231 except _SCMError as exc:
File ~/.local/lib/python3.10/site-packages/scmrepo/git/__init__.py:289, in Git._backend_func(self, name, *args, **kwargs)
288 func = getattr(backend, name)
--> 289 result = func(*args, **kwargs)
290 self._last_backend = key
File ~/.local/lib/python3.10/site-packages/scmrepo/git/backend/pygit2.py:717, in Pygit2Backend.merge(self, rev, commit, msg, squash)
716 except GitError as exc:
--> 717 raise SCMError("Merge failed") from exc
...
233 if branch:
234 self.scm.set_ref(EXEC_BRANCH, branch, symbolic=True)
GitMergeError: Exception occured in `DVCLiveCallback` when calling event `after_fit`:
Merge failed
The only uncommitted changes are:
Pip freeze:
aiobotocore==2.4.2
aiohttp==3.8.3
aiohttp-retry==2.8.3
aioitertools==0.11.0
aiosignal==1.3.1
amqp==5.1.1
antlr4-python3-runtime==4.9.3
appdirs==1.4.4
asttokens==2.2.1
async-timeout==4.0.2
asyncssh==2.13.0
atpublic==3.1.1
attrs==22.2.0
backcall==0.2.0
billiard==3.6.4.0
blis==0.7.9
boto3==1.24.59
botocore==1.27.59
catalogue==2.0.8
celery==5.2.7
certifi==2022.12.7
cffi==1.15.1
charset-normalizer==2.1.1
click==8.1.3
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
colorama==0.4.6
comm==0.1.2
commonmark==0.9.1
confection==0.0.3
configobj==5.0.6
contourpy==1.0.6
cryptography==39.0.0
cycler==0.11.0
cymem==2.0.7
debugpy==1.6.4
decorator==5.1.1
dictdiffer==0.9.0
diskcache==5.4.0
distro==1.8.0
dpath==2.1.3
dulwich==0.20.50
dvc==2.38.1
dvc-data==0.28.4
dvc-http==2.27.2
dvc-objects==0.14.0
dvc-render==0.0.15
dvc-s3==2.21.0
dvc-task==0.1.8
dvclive==1.3.0
entrypoints==0.4
executing==1.2.0
fastai==2.7.10
fastcore==1.5.27
fastdownload==0.0.7
fastprogress==1.0.3
filelock==3.9.0
flatten-dict==0.4.2
flufl.lock==7.1.1
fonttools==4.38.0
frozenlist==1.3.3
fsspec==2022.11.0
funcy==1.17
future==0.18.2
gitdb==4.0.10
GitPython==3.1.30
grandalf==0.6
hydra-core==1.3.1
idna==3.4
ipykernel==6.19.4
ipython==8.7.0
iterative-telemetry==0.0.6
jedi==0.18.2
Jinja2==3.1.2
jmespath==1.0.1
joblib==1.2.0
jupyter_client==7.4.8
jupyter_core==5.1.2
kiwisolver==1.4.4
kombu==5.2.4
langcodes==3.3.0
MarkupSafe==2.1.1
matplotlib==3.6.2
matplotlib-inline==0.1.6
multidict==6.0.4
murmurhash==1.0.9
nanotime==0.5.2
nest-asyncio==1.5.6
networkx==2.8.8
numpy==1.24.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
omegaconf==2.3.0
packaging==22.0
pandas==1.5.2
parso==0.8.3
pathspec==0.9.0
pathy==0.10.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.4.0
platformdirs==2.6.2
preshed==3.0.8
prompt-toolkit==3.0.36
psutil==5.9.4
ptyprocess==0.7.0
pure-eval==0.2.2
pycparser==2.21
pydantic==1.10.4
pydot==1.4.2
pygit2==1.11.1
Pygments==2.14.0
pygtrie==2.5.0
pyparsing==3.0.9
python-box==6.1.0
python-dateutil==2.8.2
pytz==2022.7
PyYAML==6.0
pyzmq==24.0.1
requests==2.28.1
rich==13.0.0
ruamel.yaml==0.17.21
ruamel.yaml.clib==0.2.7
s3fs==2022.11.0
s3transfer==0.6.0
scikit-learn==1.2.0
scipy==1.9.3
scmrepo==0.1.4
shortuuid==1.0.11
shtab==1.5.8
six==1.16.0
smart-open==6.3.0
smmap==5.0.0
spacy==3.4.4
spacy-legacy==3.0.11
spacy-loggers==1.0.4
srsly==2.4.5
stack-data==0.6.2
tabulate==0.9.0
thinc==8.1.6
threadpoolctl==3.1.0
tomlkit==0.11.6
torch==1.13.1
torchvision==0.14.1
tornado==6.2
tqdm==4.64.1
traitlets==5.8.0
typer==0.7.0
typing_extensions==4.4.0
urllib3==1.26.13
vine==5.0.0
voluptuous==0.13.1
wasabi==0.10.1
wcwidth==0.2.5
wrapt==1.14.1
yarl==1.8.2
zc.lockfile==2.0
I would guess that the code is failing inside of the __exit__ block, here: https://github.com/iterative/dvclive/blob/main/src/dvclive/live.py#L382
The extension should still right the situation the next time that data is updated (https://github.com/iterative/vscode-dvc/blob/add-context-enum/extension/src/fileSystem/index.ts#L175).
For the failed exp, is it expected @daavoo @dberenbaum ? (this is a Codespaces, new @alex000kim project, after git checkout 1-notebook):
I'm taking a look. It is an error coming from dvc exp save.
Aside from the origin of the problem, we should catch the exceptions coming from DVC and show a warning instead of raising them.
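A rough sketch of the "warn instead of raise" idea, assuming the failing step can be wrapped in a callable. `call_with_warning` is a hypothetical helper for illustration and is not DVCLive's actual code. The broad `except` is a deliberate simplification here, and a real change would probably catch a narrower DVC exception type.

```python
import warnings


def call_with_warning(action, description="DVC operation"):
    """Run action(); on failure, emit a warning and return None instead of raising.

    Hypothetical helper: `action` stands in for whatever call into DVC fails
    (for example, the experiment save step discussed above).
    """
    try:
        return action()
    except Exception as exc:  # broad on purpose in this sketch
        warnings.warn(f"{description} failed: {exc}", RuntimeWarning, stacklevel=2)
        return None
```

With something like this, a "Merge failed" error at the end of training would surface as a warning in the notebook output rather than aborting the callback.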
The extension should still right the situation the next time that data is updated (https://github.com/iterative/vscode-dvc/blob/add-context-enum/extension/src/fileSystem/index.ts#L175).
I think we need a better way to detect failures eventually (check pid from time to time?)
The onboarding experience looks way better now! I mostly only found minor issues that I don't think are worth prioritizing right now.
The one place where I still see some friction is seeing the results of my experiment (skip to 0:10, something's wrong with my video editing).
https://user-images.githubusercontent.com/2308172/209039833-be278d6d-aaea-4c51-8a45-12b6cba8d893.mp4
Some ways to better highlight the experiment:
These are mostly changes that I think were introduced in #2877. No need to revisit that whole discussion again, but can we bring back some of the auto-select behavior that got removed there? I'm still finding there is much more friction in finding my experiment results now than there was a few weeks ago.
Old experience for comparison:
https://user-images.githubusercontent.com/2308172/209039450-82f7467b-c9ab-44da-aefc-df5b6ce08f98.mov