facebookresearch / ClassyVision

An end-to-end PyTorch framework for image and video classification
https://classyvision.ai
MIT License
1.59k stars, 278 forks

process hangs and no logs showing. #763

Closed (jpainam closed this issue 2 years ago)

jpainam commented 3 years ago

🐛 Bug

I'm running into too many difficulties with this library and am going to move back to plain PyTorch. This is my training code, written after following the video classification code.

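import os
import time

from classy_vision.hooks import CheckpointHook, LossLrMeterLoggingHook
from classy_vision.trainer import LocalTrainer

# Log loss, learning rate, and meters; log_freq=1 requests a log line every batch.
hooks = [LossLrMeterLoggingHook(log_freq=1)]

# Use a fresh checkpoint directory per run; makedirs also creates ./checkpoints if it is missing.
checkpoint_dir = f"./checkpoints/checkpoint_{time.time()}"
os.makedirs(checkpoint_dir, exist_ok=True)
hooks.append(CheckpointHook(checkpoint_dir, input_args={}))

# `task` is the task object built earlier while following the video classification code.
task = task.set_hooks(hooks)
trainer = LocalTrainer()
trainer.train(task)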

The error ValueError: Cannot register duplicate param scheduler (step) keeps appearing at random, even after restarting the Jupyter kernel.

mannatsingh commented 3 years ago

I am looking into the param scheduler issue in https://github.com/facebookresearch/ClassyVision/issues/762. As for the hang: for posterity, the code is probably not hanging. Logging just hasn't been set up, so nothing is printed. That can be fixed by running:

import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)
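
Raising the root logger's level only helps if a handler is already attached to it, as is often the case in a notebook kernel. In a plain Python script with no handlers configured, INFO records would still be dropped. A minimal sketch of a setup that should work in both cases, assuming the hooks emit their messages at INFO level through the standard logging module (as the snippet above implies), is shown below. The helper name enable_info_logging and the logger name demo.hook are hypothetical and not part of ClassyVision; force=True requires Python 3.8 or newer.

```python
import io
import logging


def enable_info_logging(stream=None):
    """Attach a handler to the root logger and set its level to INFO.

    Hypothetical helper for illustration; not part of ClassyVision.
    """
    # force=True (Python 3.8+) removes handlers already attached to the
    # root logger, so the call takes effect even if logging was configured
    # earlier (for example by a notebook kernel).
    logging.basicConfig(
        level=logging.INFO,
        format="%(levelname)s %(name)s: %(message)s",
        stream=stream,  # None means sys.stderr
        force=True,
    )


# Demonstration: capture output in memory instead of writing to stderr.
buffer = io.StringIO()
enable_info_logging(stream=buffer)

# Stand-in for a library hook that logs at INFO level.
logging.getLogger("demo.hook").info("loss: 0.5")

captured = buffer.getvalue()
print(captured, end="")
```

In a training script or notebook, calling enable_info_logging() with no arguments before trainer.train(task) would send the records to stderr instead of an in-memory buffer.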