@Jia-Baos, I cannot reproduce this issue. Here is what I get when I run patchcore
Could it be that, with your hardware configuration, validation simply takes a long time?
To double check this, you could change the model to
```yaml
model:
  name: patchcore
  backbone: resnet18
  pre_trained: true
  layers:
    - layer2
    - layer3
  coreset_sampling_ratio: 0.1
  num_neighbors: 9
  normalization_method: min_max # options: [null, min_max, cdf]
```
or
```yaml
model:
  name: patchcore
  backbone: resnet18
  pre_trained: true
  layers:
    - layer3
  coreset_sampling_ratio: 0.1
  num_neighbors: 9
  normalization_method: min_max # options: [null, min_max, cdf]
```
to make the model more lightweight.
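For reference, here is roughly how such a modified config can be used to launch a run. This is a minimal sketch based on the anomalib 0.x `tools/train.py` flow; the config path is a placeholder, and the exact imports may differ slightly in your installed version. Lowering `dataset.num_workers` also addresses the DataLoader warning about creating 8 workers on a 4-core machine.

```python
from pytorch_lightning import Trainer

from anomalib.config import get_configurable_parameters
from anomalib.data import get_datamodule
from anomalib.models import get_model
from anomalib.utils.callbacks import get_callbacks

# "my_patchcore.yaml" is a placeholder path for the lighter config above.
config = get_configurable_parameters(model_name="patchcore", config_path="my_patchcore.yaml")
config.dataset.num_workers = 4  # avoid the "8 worker processes" DataLoader warning

datamodule = get_datamodule(config)   # MVTec bottle, per the dataset section of the config
model = get_model(config)             # PatchCore with the resnet18 backbone
callbacks = get_callbacks(config)

trainer = Trainer(**config.trainer, callbacks=callbacks)
trainer.fit(model=model, datamodule=datamodule)
```

Alternatively, if your checkout supports it, running `python tools/train.py --config path/to/my_patchcore.yaml` should do the same thing.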
Thank you so much. I have adopted your recommendations and changed the model. You're right, validation just needs a long time to complete.
We have just merged PR #580, which partially addresses this. See #268 and #533.
I'll be converting this to a Q&A in Discussions. Feel free to continue from there. Cheers!
Describe the bug
When I use PatchCore to train on data (MVTec bottle), an error appears, like this: Validation: 0it [00:00, ?it/s], and the process cannot continue.
To Reproduce
Steps to reproduce the behavior:
nothing
Expected behavior
```
C:\Users\fx50j.conda\envs\anomalib_env\python.exe D:/PythonProject/anomalib/tools/MyTest.py
1.12.0+cpu None None False 0
Transform configs has not been provided. Images will be normalized using ImageNet statistics.
Transform configs has not been provided. Images will be normalized using ImageNet statistics.
C:\Users\fx50j.conda\envs\anomalib_env\lib\site-packages\torch\utils\data\dataloader.py:557: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 4 (cpuset is not taken into account), which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
dict_keys(['image', 'image_path', 'label', 'mask_path', 'mask'])
torch.Size([1, 3, 224, 224]) torch.Size([1, 224, 224])
C:\Users\fx50j.conda\envs\anomalib_env\lib\site-packages\torchmetrics\utilities\prints.py:36: UserWarning: Metric `PrecisionRecallCurve` will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)
D:\PythonProject\anomalib\anomalib\utils\callbacks\__init__.py:133: UserWarning: Export option: None not found. Defaulting to no model export
  warnings.warn(f"Export option: {config.optimization.export_mode} not found. Defaulting to no model export")
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
`Trainer(limit_val_batches=1.0)` was configured so 100% of the batches will be used..
`Trainer(limit_test_batches=1.0)` was configured so 100% of the batches will be used..
`Trainer(limit_predict_batches=1.0)` was configured so 100% of the batches will be used..
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
Missing logger folder: results\patchcore\mvtec\bottle\lightning_logs
C:\Users\fx50j.conda\envs\anomalib_env\lib\site-packages\torchmetrics\utilities\prints.py:36: UserWarning: Metric `ROC` will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)
C:\Users\fx50j.conda\envs\anomalib_env\lib\site-packages\pytorch_lightning\core\optimizer.py:183: UserWarning: `LightningModule.configure_optimizers` returned `None`, this fit will run with no optimizer
  rank_zero_warn(

  | Name                  | Type                     | Params
--------------------------------------------------------------
0 | image_threshold       | AdaptiveThreshold        | 0
1 | pixel_threshold       | AdaptiveThreshold        | 0
2 | model                 | PatchcoreModel           | 24.9 M
3 | image_metrics         | AnomalibMetricCollection | 0
4 | pixel_metrics         | AnomalibMetricCollection | 0
5 | normalization_metrics | MinMax                   | 0
--------------------------------------------------------------
24.9 M    Trainable params
0         Non-trainable params
24.9 M    Total params
99.450    Total estimated model params size (MB)

C:\Users\fx50j.conda\envs\anomalib_env\lib\site-packages\torch\utils\data\dataloader.py:557: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 4 (cpuset is not taken into account), which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
C:\Users\fx50j.conda\envs\anomalib_env\lib\site-packages\pytorch_lightning\trainer\trainer.py:1933: PossibleUserWarning: The number of training batches (7) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  rank_zero_warn(
Epoch 0:   1%|          | 1/90 [01:07<1:40:34, 67.80s/it, loss=nan, v_num=0]
C:\Users\fx50j.conda\envs\anomalib_env\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py:137: UserWarning: `training_step` returned `None`. If this was on purpose, ignore this warning...
  self.warning_cache.warn("`training_step` returned `None`. If this was on purpose, ignore this warning...")
Epoch 0:   8%|▊         | 7/90 [01:51<22:00, 15.91s/it, loss=nan, v_num=0]
Validation: 0it [00:00, ?it/s]
```
Screenshots
Hardware and Software Configuration
Additional context
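A note on why the run looks stalled: PatchCore is a memory-bank method, so there is nothing to optimize during the training loop. The sketch below (a simplified, hypothetical stand-in, not the actual anomalib source) illustrates why the log shows loss=nan, `training_step` returning `None`, and a fit with no optimizer, and why the heavy work (coreset subsampling and nearest-neighbour scoring) only starts around validation, which can take a very long time on CPU.

```python
class PatchcoreLikeModule:
    """Hypothetical, simplified stand-in for a PatchCore Lightning module."""

    def __init__(self, model):
        self.model = model      # CNN feature extractor + memory bank
        self.embeddings = []    # patch embeddings collected during "training"

    def configure_optimizers(self):
        return None             # no optimizer -> "this fit will run with no optimizer"

    def training_step(self, batch, batch_idx):
        embedding = self.model(batch["image"])  # just extract patch features
        self.embeddings.append(embedding)       # implicit None return -> loss=nan in the bar

    def on_validation_start(self):
        # Coreset subsampling over all collected embeddings builds the memory bank.
        # This, plus the per-patch nearest-neighbour search below, is the slow part on CPU.
        self.model.subsample_embedding(self.embeddings, sampling_ratio=0.1)

    def validation_step(self, batch, batch_idx):
        # Nearest-neighbour anomaly scoring against the memory bank.
        return self.model(batch["image"])
```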