Closed DFuller134 closed 3 months ago
Looking through your log (search for "Traceback") I see this issue:
comet_ml.vendor.nvidia_ml.pynvml.NVMLError_NotSupported: Not Supported
but that shouldn't cause any issues. I also see:
[[13.3],
[33.9],
[54.5],
[75.1],
[95.7]]
ValueError: can only convert an array of size 1 to a Python scalar
which could be a Comet bug.
Also:
ModuleNotFoundError: No module named 'graphviz'
Pip install graphviz (or another dot package) to see if that helps.
I addressed each of these issues except the NVML error (related to GPU drivers likely needed to log GPU metrics). The ValueError cleared up when I set COMET_DISABLE_AUTO_LOGGING=1. I also installed graphviz to clear up that issue.
I would agree that the ValueError seems like a CometML bug.
@DFuller134 thanks for your update! I'll pass on the details of the NVMLError_NotSupported
error to our engineering team.
Do you know what [[33.9], [54.5], [75.1], [95.7]]
is? If you are trying to log a parameter (or step or epoch) value, it can't be a list of values. I believe that these are the only places that this error could come from.
Describe the Bug
After importing comet_ml a scikit-learn-based training script fails during sklearn grid search cross-validation: "broken pipe" exception in joblib. Works fine without import of comet_ml.
Expected behavior
Training script should execute to completion with import of comet_ml
Where is the issue?
Third Party Integrations (scikit-learn). Stack trace indicates calls into comet_ml monkey-patching.
To Reproduce
Steps to reproduce the behavior:
Stack Trace
Comet Debug Log
comet.log
Screenshots or GIFs
N/A
Additional context (code fragment - fails on grid.fit)