Open aazuspan opened 8 months ago
Just want to confirm how severe it is for you. It sounds like you can work around the problem and are reporting more to let us know. Is that right?
That's right - importing `dvclive` before `tensorflow` is an easy workaround, so this doesn't break my usage.
I'm not able to reproduce this so far. If I try with the code you showed and no `file.csv`, I only get the warning `WARNING:dvclive:Error in cache: output 'file.csv' does not exist`. No error is raised.
I didn't mention it in the OP, but I've only encountered the error in notebooks. I'm able to reproduce it consistently in Colab: https://colab.research.google.com/drive/1aEkYJdQ6cYiu2ICFppg8WrJwr0y6-SG4?usp=sharing
In this case, I get the warning because there are no commits, which triggers the `ValueError`.
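For anyone following along, the symptom itself (a warning turning into a `ValueError`) can be shown without DVC. This is only a rough sketch under assumptions: `RaisingStreamHandler` and `warn_on_closed_stream` are made-up names, and the handler is a stand-in that writes straight to its stream and lets write errors propagate, not DVC's actual `LoggerHandler`. It shows what happens once the stream is closed, not what closes it.

```python
import io
import logging


class RaisingStreamHandler(logging.Handler):
    """Hypothetical handler: writes directly to a stream and lets errors propagate."""

    def __init__(self, stream):
        super().__init__()
        self.stream = stream

    def emit(self, record):
        # No try/except here, so a failed write surfaces to whoever logged the message.
        self.stream.write(self.format(record) + "\n")


def warn_on_closed_stream():
    """Log a warning to a closed stream and return the resulting error message."""
    stream = io.StringIO()
    logger = logging.getLogger("closed_stream_demo")
    logger.setLevel(logging.WARNING)
    logger.propagate = False  # keep the demo message away from the root logger
    handler = RaisingStreamHandler(stream)
    logger.addHandler(handler)
    stream.close()  # simulate something else closing the stream
    try:
        logger.warning("output 'file.csv' does not exist")
    except ValueError as exc:
        return str(exc)
    finally:
        logger.removeHandler(handler)
    return None


if __name__ == "__main__":
    print(warn_on_closed_stream())
```

The standard `logging.StreamHandler` would normally swallow this through `handleError`, which is why the sketch uses a handler that does not catch the exception.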
Thanks for looking into it!
Looks like it may be related to https://github.com/iterative/dvc/issues/8387
Confirmed that dropping this line fixes the issue:
https://github.com/iterative/dvc/blob/b46bd9ca654f5e2dbeeb31f4eb6acaf80a7fde0a/dvc/logger.py#L182
@skshetry Looks like you were testing this in https://github.com/iterative/dvc/pull/8985. Do you have any idea what's wrong or how to handle it?
I'm running into a strange bug whenever `Live.log_artifact` tries to raise a warning or error with `tensorflow` imported. It should be reproducible below (`tensorflow==2.13.0`, `dvc==3.38.1`, and `dvclive==3.5.1`) as long as you have uncommitted changes in the repo (or some other way to raise a warning/error, like a missing artifact path). Running it raises `ValueError: I/O operation on closed file`.
Full traceback
```python
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File /usr/local/lib/python3.8/dist-packages/dvc/logger.py:156, in LoggerHandler.emit(self, record)
    155 msg = self.format(record)
--> 156 Tqdm.write(msg, file=self.stream, end=getattr(self, "terminator", "\n"))
    157 self.flush()

File /usr/local/lib/python3.8/dist-packages/tqdm/std.py:725, in tqdm.write(cls, s, file, end, nolock)
    723 with cls.external_write_mode(file=file, nolock=nolock):
    724     # Write the message
--> 725     fp.write(s)
    726     fp.write(end)

File /usr/local/lib/python3.8/dist-packages/colorama/ansitowin32.py:47, in StreamWrapper.write(self, text)
     46 def write(self, text):
---> 47     self.__convertor.write(text)

File /usr/local/lib/python3.8/dist-packages/colorama/ansitowin32.py:177, in AnsiToWin32.write(self, text)
    176 if self.strip or self.convert:
--> 177     self.write_and_convert(text)
    178 else:

File /usr/local/lib/python3.8/dist-packages/colorama/ansitowin32.py:205, in AnsiToWin32.write_and_convert(self, text)
    204     cursor = end
--> 205 self.write_plain_text(text, cursor, len(text))

File /usr/local/lib/python3.8/dist-packages/colorama/ansitowin32.py:210, in AnsiToWin32.write_plain_text(self, text, start, end)
    209 if start < end:
--> 210     self.wrapped.write(text[start:end])
    211     self.wrapped.flush()

File /usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py:633, in OutStream.write(self, string)
    632     msg = "I/O operation on closed file"
--> 633     raise ValueError(msg)
    634 else:

ValueError: I/O operation on closed file

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[1], line 5
      2 from dvclive import Live
      4 with Live() as live:
----> 5     live.log_artifact("file.csv")

File /usr/local/lib/python3.8/dist-packages/dvclive/live.py:628, in Live.__exit__(self, exc_type, exc_val, exc_tb)
    626 def __exit__(self, exc_type, exc_val, exc_tb):
    627     self._inside_with = False
--> 628     self.end()

File /usr/local/lib/python3.8/dist-packages/dvclive/live.py:605, in Live.end(self)
    600 if self._dvcyaml:
    601     catch_and_warn(DvcException, logger)(self._dvc_repo.scm.add)(
    602         self.dvc_file
    603     )
--> 605 self.save_dvc_exp()
    607 # Mark experiment as done
    608 self.post_to_studio("done")

File /usr/local/lib/python3.8/dist-packages/dvclive/utils.py:183, in catch_and_warn.
```

From what I can tell from the traceback, DVC tries to raise a warning for uncommitted changes, which gets logged to the closed `stderr` stream, causing the I/O error. The same error will occur for other errors or warnings raised by `dvclive` as well.

I found a few hints that might help diagnose the issue below, but I think I've reached the limit of what I'm able to figure out.

- The error occurs when `tensorflow` is imported before `dvclive`. When the order is reversed, `sys.stderr` remains open and no error is encountered.
- `sys.stderr` getting closed during testing.
- `sys.stderr` becomes closed as soon as `gto` is imported by `log_artifact`. Disabling that import and the related calls resolves the issue. Looking at the init for the `gto.log` module shows that it's setting some logging config. https://github.com/iterative/dvclive/blob/5c650ef55a0bc74b7025a861f6127ba217a07e63/src/dvclive/live.py#L498-L499

It's possible this is ultimately a TensorFlow/Keras issue rather than a DVC issue, but I'm not sure how to reproduce it without DVC.
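
In case it helps anyone narrow this down, here is roughly how the state of `sys.stderr` can be checked after each import. It's a minimal sketch, not the exact code I ran: `stderr_closed_after_imports` is a made-up helper name, and the example at the bottom uses stdlib modules as stand-ins so it runs anywhere. In a notebook, the list would be something like `["tensorflow", "dvclive", "gto"]`, in whichever order is being tested.

```python
import importlib
import sys


def stderr_closed_after_imports(module_names):
    """Import each module in order and record whether sys.stderr is closed afterwards."""
    states = []
    for name in module_names:
        importlib.import_module(name)
        # Not every stream object exposes `closed`; treat a missing attribute as open.
        states.append((name, getattr(sys.stderr, "closed", False)))
    return states


if __name__ == "__main__":
    # Stand-in stdlib modules; replace with the packages under investigation.
    for name, closed in stderr_closed_after_imports(["json", "logging"]):
        # Printed to stdout on purpose, since stderr may be the closed stream.
        print(f"after importing {name}: sys.stderr.closed = {closed}")
```

Note that modules already imported in the session are served from the import cache, so this is best run in a fresh kernel.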
Any help would be appreciated, thanks!