Open Zhong-Zi-Zeng opened 8 months ago
Hey, I had the same problem as you. Here's my solution.
First, a new hook needs to be defined. Here, I redefined after_test_iter in LoggerHook and RuntimeInfoHook so that the test_dataloader is used to validate the model after each training epoch:
from mmengine.hooks import Hook, LoggerHook, RuntimeInfoHook
from mmengine.registry import HOOKS


@HOOKS.register_module()
class MyHook(Hook):
    """Compute the loss on the test dataloader after every training epoch."""

    def val_step(self, model, data, optim_wrapper):
        with optim_wrapper.optim_context(model):
            data = model.data_preprocessor(data, True)
            losses = model(**data, mode='loss')  # type: ignore
        parsed_losses, log_vars = model.parse_losses(losses)
        return log_vars

    def after_train_epoch(self, runner) -> None:
        model = runner.model
        # The model is kept in train mode while the losses are computed.
        model.train()
        optim_wrapper = runner.optim_wrapper
        dataloader = runner.test_dataloader

        # Find the two hooks whose after_test_iter methods are patched.
        for hook in runner._hooks:
            if isinstance(hook, LoggerHook):
                logger = hook
            elif isinstance(hook, RuntimeInfoHook):
                runtimeinfo = hook

        for i, data in enumerate(dataloader):
            outputs = self.val_step(model, data, optim_wrapper)
            runtimeinfo.after_test_iter(runner, None, None, outputs)
            # batch_idx is 0-based here, as in MMEngine's own loops.
            logger.after_test_iter(runner, i)
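The hook above hands each batch's log_vars to the logger one batch at a time. If a single number per epoch is easier to track, the per-batch dictionaries can be averaged first. This is only a minimal, framework-free sketch: `average_log_vars` is a hypothetical helper, not part of MMEngine, and it assumes every value is already a plain float (a tensor would need `.item()` first).

```python
from collections import defaultdict
from typing import Dict, Iterable


def average_log_vars(batches: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Return the unweighted mean of each loss key over all batches."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for log_vars in batches:
        for key, value in log_vars.items():
            totals[key] += float(value)
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Hypothetical per-batch outputs, shaped like the dict returned by val_step.
per_batch = [
    {'loss': 1.0, 'loss_cls': 0.5},
    {'loss': 3.0, 'loss_cls': 1.5},
]
epoch_losses = average_log_vars(per_batch)
print(epoch_losses)
```

Note that this is a mean over batches, so a smaller final batch counts as much as a full one.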
Next, I modified two files (the site-packages path will differ in your environment): /root/anaconda3/lib/python3.11/site-packages/mmengine/hooks/logger_hook.py and /root/anaconda3/lib/python3.11/site-packages/mmengine/hooks/runtime_info_hook.py. For the LoggerHook, add this code:
def after_test_iter(self,
                    runner,
                    batch_idx: int,
                    data_batch: DATA_BATCH = None,
                    outputs: Optional[dict] = None) -> None:
    """Record logs after test iteration.

    Args:
        runner (Runner): The runner of the training process.
        batch_idx (int): The index of the current batch in the test loop.
        data_batch (dict tuple or list, optional): Data from dataloader.
        outputs (dict, optional): Outputs from model.
    """
    # Print experiment name every n iterations.
    if self.every_n_train_iters(
            runner, self.interval_exp_name) or (self.end_of_epoch(
                runner.test_dataloader, batch_idx)):
        exp_info = f'Exp name: {runner.experiment_name}'
        runner.logger.info(exp_info)
    if self.every_n_inner_iters(batch_idx, self.interval):
        tag, log_str = runner.log_processor.get_log_after_iter(
            runner, batch_idx, 'test')
    elif (self.end_of_epoch(runner.test_dataloader, batch_idx)
          and (not self.ignore_last
               or len(runner.test_dataloader) <= self.interval)):
        # The number of test batches may not be divisible by
        # `self.interval`. If `self.ignore_last` is False, the log of the
        # remaining iterations is still recorded (Epoch [4][1000/1007],
        # the logs of 998-1007 iterations will be recorded).
        tag, log_str = runner.log_processor.get_log_after_iter(
            runner, batch_idx, 'test')
    else:
        return
    runner.logger.info(log_str)
    runner.visualizer.add_scalars(
        tag, step=runner.iter + 1, file_path=self.json_log_path)
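Which batches actually produce a log line depends on interval and ignore_last. The sketch below re-implements that decision in plain Python so it can be checked without a runner. It assumes that every_n_inner_iters is true when `(batch_idx + 1) % interval == 0` and that end_of_epoch is true when `batch_idx + 1 == len(dataloader)`; `should_log` is a hypothetical name used only for illustration.

```python
def should_log(batch_idx: int, num_batches: int, interval: int,
               ignore_last: bool) -> bool:
    """Mirror the if/elif/else branch of the after_test_iter method above."""
    # Assumed behavior of every_n_inner_iters (0-based batch index).
    every_n = interval > 0 and (batch_idx + 1) % interval == 0
    # Assumed behavior of end_of_epoch.
    last_batch = batch_idx + 1 == num_batches
    if every_n:
        return True
    if last_batch and (not ignore_last or num_batches <= interval):
        return True
    return False


# 7 batches, logging every 3 batches.
logged = [i for i in range(7) if should_log(i, 7, 3, ignore_last=True)]
logged_with_tail = [
    i for i in range(7) if should_log(i, 7, 3, ignore_last=False)
]
print(logged, logged_with_tail)
```

With ignore_last=True the leftover batch at the end is skipped; with ignore_last=False it gets its own log line.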
For the RuntimeInfoHook, add this code:
def after_test_iter(self,
                    runner,
                    batch_idx: int,
                    data_batch: DATA_BATCH = None,
                    outputs: Optional[dict] = None) -> None:
    """Update ``log_vars`` in model outputs every iteration.

    Args:
        runner (Runner): The runner of the training process.
        batch_idx (int): The index of the current batch in the test loop.
        data_batch (Sequence[dict], optional): Data from dataloader.
            Defaults to None.
        outputs (dict, optional): Outputs from model. Defaults to None.
    """
    if outputs is not None:
        for key, value in outputs.items():
            runner.message_hub.update_scalar(f'test/{key}', value)
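The `test/` prefix keeps these values under their own keys, so a loss recorded here does not overwrite one stored under another prefix. A toy illustration of that loop follows. `ToyMessageHub` and `record_outputs` are hypothetical stand-ins written for this example; they are not MMEngine classes and only imitate the `update_scalar(key, value)` call used above.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class ToyMessageHub:
    """Hypothetical stand-in that stores a history of values per key."""

    def __init__(self) -> None:
        self.scalars: Dict[str, List[float]] = defaultdict(list)

    def update_scalar(self, key: str, value: float) -> None:
        self.scalars[key].append(float(value))


def record_outputs(hub: ToyMessageHub, outputs: Optional[dict],
                   prefix: str = 'test') -> None:
    """Same loop as after_test_iter above, with the prefix made explicit."""
    if outputs is not None:
        for key, value in outputs.items():
            hub.update_scalar(f'{prefix}/{key}', value)


hub = ToyMessageHub()
record_outputs(hub, {'loss': 0.8}, prefix='train')
record_outputs(hub, {'loss': 1.2})
record_outputs(hub, None)  # nothing is recorded when outputs is None
print(dict(hub.scalars))
```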
Then, register the hook in the config file and update test_dataloader so that it loads the annotations:
custom_hooks = [
    ...
    dict(type='MyHook'),
    ...
]
test_dataloader = dict(
    ...
    # Must be the same collate_fn as in train_dataloader.
    collate_fn=dict(type='yolov5_collate'),
    pipeline=[
        ...
        dict(type='LoadAnnotations', with_bbox=True),
        ...
    ],
    ...
)
Finally, run train.py. The validation loss appears in the results and is also saved in the log in JSON format.
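To pull the logged values back out, the JSON log can be read line by line. This sketch assumes the log holds one JSON object per line and that the value of interest is stored under a key named `loss`. Both the sample lines and the key name are made up for illustration, so check them against your own log file; `read_scalar` is a hypothetical helper.

```python
import json
from typing import Iterable, List


def read_scalar(lines: Iterable[str], key: str) -> List[float]:
    """Collect the values of `key` from a line-delimited JSON log."""
    values = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if key in record:
            values.append(record[key])
    return values


# Hypothetical log lines; a real run would iterate over the opened log file.
sample_log = [
    '{"loss": 1.25, "step": 10}',
    '{"lr": 0.001, "step": 20}',
    '{"loss": 0.75, "step": 30}',
]
losses = read_scalar(sample_log, 'loss')
print(losses)
```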
@g824718114 Thank you for providing the method. May I ask about this change to the test_dataloader in the configuration file:
collate_fn=dict(type='yolov5_collate')
Where is this defined? I haven't used YOLOv5. Or is this a universal modification?
I'm sorry that I didn't make it clear. This is not a universal modification. The collate_fn in test_dataloader needs to be the same as the one in train_dataloader.
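One way to keep the two settings in sync is to copy the value from the training dataloader instead of retyping it. The dictionaries below are hypothetical and heavily trimmed, shown as plain Python for illustration only; a real config has many more fields.

```python
# Hypothetical, trimmed dataloader settings (not a complete config).
train_dataloader = dict(
    batch_size=8,
    collate_fn=dict(type='yolov5_collate'),
)
test_dataloader = dict(
    batch_size=1,
)

# Reuse the training collate_fn; dict(...) makes a separate copy so that
# later edits to one dataloader do not silently change the other.
test_dataloader['collate_fn'] = dict(train_dataloader['collate_fn'])
print(test_dataloader)
```

If the training dataloader does not set a collate_fn at all, there is nothing to copy and the test dataloader can be left as it is.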
---Original--- Date: Tue, Jun 11, 2024 09:42 AM Subject: Re: [open-mmlab/mmdetection] How to use hook to get the validationloss? (Issue #11331)
Recently, I implemented a simple method to get the validation loss. The code is below.
Although it loads the validation data correctly, I get this error message.
So I printed the height and width and found that they change every time. But when I run the code below, it works correctly.
Could someone help me with this? I would appreciate it.