what does your log file look like? it's difficult to understand what went wrong without more details
here is the log.txt file for seed1
** Arguments **
***************
backbone:
config_file: configs/trainers/StyleMatch/ssdg_pacs_v1.yaml
dataset_config_file: configs/datasets/ssdg_pacs.yaml
eval_only: False
head:
load_epoch: None
model_dir:
no_train: False
opts: ['MODEL.BACKBONE.NAME', 'resnet18', 'DATASET.NUM_LABELED', '210']
output_dir: output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting/seed1
resume:
root: /home/johnsonibironke/kaiyang/data
seed: 1
source_domains: ['cartoon', 'photo', 'sketch']
target_domains: ['art_painting']
trainer: StyleMatch
transforms: None
************
** Config **
************
DATALOADER:
K_TRANSFORMS: 1
NUM_WORKERS: 4
RETURN_IMG0: True
TEST:
BATCH_SIZE: 100
SAMPLER: SequentialSampler
TRAIN_U:
BATCH_SIZE: 32
N_DOMAIN: 0
N_INS: 16
SAME_AS_X: True
SAMPLER: RandomSampler
TRAIN_X:
BATCH_SIZE: 48
N_DOMAIN: 0
N_INS: 16
SAMPLER: SeqDomainSampler
DATASET:
ALL_AS_UNLABELED: False
CIFAR_C_LEVEL: 1
CIFAR_C_TYPE:
NAME: SSDGPACS
NUM_LABELED: 210
NUM_SHOTS: -1
ROOT: /home/johnsonibironke/kaiyang/data
SOURCE_DOMAINS: ['cartoon', 'photo', 'sketch']
STL10_FOLD: -1
TARGET_DOMAINS: ['art_painting']
VAL_PERCENT: 0.1
INPUT:
COLORJITTER_B: 0.4
COLORJITTER_C: 0.4
COLORJITTER_H: 0.1
COLORJITTER_S: 0.4
CROP_PADDING: 4
CUTOUT_LEN: 16
CUTOUT_N: 1
GB_K: 21
GB_P: 0.5
GN_MEAN: 0.0
GN_STD: 0.15
INTERPOLATION: bilinear
NO_TRANSFORM: False
PIXEL_MEAN: [0.485, 0.456, 0.406]
PIXEL_STD: [0.229, 0.224, 0.225]
RANDAUGMENT_M: 10
RANDAUGMENT_N: 2
RGS_P: 0.2
SIZE: (224, 224)
TRANSFORMS: ('random_flip', 'random_translation', 'normalize')
MODEL:
BACKBONE:
NAME: resnet18
PRETRAINED: True
HEAD:
ACTIVATION: relu
BN: True
DROPOUT: 0.0
HIDDEN_LAYERS: ()
NAME:
INIT_WEIGHTS:
OPTIM:
ADAM_BETA1: 0.9
ADAM_BETA2: 0.999
BASE_LR_MULT: 0.1
GAMMA: 0.1
LR: 0.003
LR_SCHEDULER: cosine
MAX_EPOCH: 40
MOMENTUM: 0.9
NAME: sgd
NEW_LAYERS: ()
RMSPROP_ALPHA: 0.99
SGD_DAMPNING: 0
SGD_NESTEROV: False
STAGED_LR: False
STEPSIZE: (-1,)
WARMUP_CONS_LR: 1e-05
WARMUP_EPOCH: -1
WARMUP_MIN_LR: 1e-05
WARMUP_RECOUNT: True
WARMUP_TYPE: linear
WEIGHT_DECAY: 0.0005
OUTPUT_DIR: output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting/seed1
RESUME:
SEED: 1
TEST:
COMPUTE_CMAT: False
EVALUATOR: Classification
FINAL_MODEL: last_step
NO_TEST: False
PER_CLASS_RESULT: False
SPLIT: test
TRAIN:
CHECKPOINT_FREQ: 0
COUNT_ITER: train_u
PRINT_FREQ: 10
TRAINER:
CG:
ALPHA_D: 0.5
ALPHA_F: 0.5
EPS_D: 1.0
EPS_F: 1.0
DAEL:
CONF_THRE: 0.95
STRONG_TRANSFORMS: ()
WEIGHT_U: 0.5
DDAIG:
ALPHA: 0.5
CLAMP: False
CLAMP_MAX: 1.0
CLAMP_MIN: -1.0
G_ARCH:
LMDA: 0.3
WARMUP: 0
ENTMIN:
LMDA: 0.001
FIXMATCH:
CONF_THRE: 0.95
STRONG_TRANSFORMS: ()
WEIGHT_U: 1.0
M3SDA:
LMDA: 0.5
N_STEP_F: 4
MCD:
N_STEP_F: 4
MEANTEA:
EMA_ALPHA: 0.999
RAMPUP: 5
WEIGHT_U: 1.0
MIXMATCH:
MIXUP_BETA: 0.75
RAMPUP: 20000
TEMP: 2.0
WEIGHT_U: 100.0
MME:
LMDA: 0.1
NAME: StyleMatch
SE:
CONF_THRE: 0.95
EMA_ALPHA: 0.999
RAMPUP: 300
STYLEMATCH:
ADAIN_DECODER: weights/decoder.pth
ADAIN_VGG: weights/vgg_normalised.pth
APPLY_AUG: True
APPLY_STY: True
CLASSIFIER: stochastic
CONF_THRE: 0.95
C_OPTIM:
ADAM_BETA1: 0.9
ADAM_BETA2: 0.999
BASE_LR_MULT: 0.1
GAMMA: 0.1
LR: 0.01
LR_SCHEDULER: cosine
MAX_EPOCH: 40
MOMENTUM: 0.9
NAME: sgd
NEW_LAYERS: ()
RMSPROP_ALPHA: 0.99
SGD_DAMPNING: 0
SGD_NESTEROV: False
STAGED_LR: False
STEPSIZE: (-1,)
WARMUP_CONS_LR: 1e-05
WARMUP_EPOCH: -1
WARMUP_MIN_LR: 1e-05
WARMUP_RECOUNT: True
WARMUP_TYPE: linear
WEIGHT_DECAY: 0.0005
INFERENCE_MODE: deterministic
N_ENSEMBLE: 10
SAVE_SIGMA: False
STRONG_TRANSFORMS: ('random_flip', 'randaugment_fixmatch', 'normalize', 'cutout')
USE_CUDA: True
VERBOSE: True
VERSION: 1
Collecting env info ...
** System info **
PyTorch version: 1.8.1
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Python version: 3.7 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] torch==1.8.1
[pip3] torchvision==0.9.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.1.243 h6bb024c_0
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.3.0 h06a4308_520
[conda] mkl-service 2.4.0 py37h7f8727e_0
[conda] mkl_fft 1.3.0 py37h42c9631_2
[conda] mkl_random 1.2.2 py37h51133e4_0
[conda] numpy 1.21.2 pypi_0 pypi
[conda] numpy-base 1.20.3 py37h74d4b33_0
[conda] pytorch 1.8.1 py3.7_cuda10.1_cudnn7.6.3_0 pytorch
[conda] torchvision 0.9.1 py37_cu101 pytorch
Pillow (8.3.1)
Loading trainer: StyleMatch
Building transform_train
+ resize to 224x224
+ random flip
+ random translation
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
Building transform_train
+ resize to 224x224
+ random flip
+ randaugment_fixmatch (n=2)
+ to torch tensor of range [0, 1]
+ cutout (n_holes=1, length=16)
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
Loading dataset: SSDGPACS
Reading split from "/home/johnsonibironke/kaiyang/data/pacs/splits_ssdg/art_painting_nlab210_seed1.json"
* Using custom transform for training
Building transform_test
+ resize to 224x224
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
***** Dataset statistics *****
Dataset: SSDGPACS
Source domains: ['cartoon', 'photo', 'sketch']
Target domains: ['art_painting']
# classes: 7
# train_x: 210
# train_u: 6,926
# val: 806
# test: 2,048
Building G
Backbone: resnet18
# params: 11,176,512
Building C
# params: 7,168
Loading evaluator: Classification
Building vgg and decoder for style transfer
Loading decoder weights from weights/decoder.pth
Loading vgg weights from weights/vgg_normalised.pth
No checkpoint found, train from scratch
Initializing summary writer for tensorboard with log_dir=output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting/seed1
There are 4 other seeds. The log.txt is attached.
what does the directory structure look like?
what's the command you used?
what's the error message you got?
how do I reproduce your error?
This is what the directory structure looks like for seed1 (where the log.txt file is located)
home/johnsonibironke/Documents/Github_files/ssdg-benchmark/output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting/seed1
This is the command I used
python parse_test_res.py output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1 --multi-exp
This is the error message
Parsing files in output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting
Traceback (most recent call last):
File "parse_test_res.py", line 188, in <module>
main(args, end_signal)
File "parse_test_res.py", line 147, in main
end_signal=end_signal
File "parse_test_res.py", line 97, in parse_function
assert len(outputs) > 0, f'Nothing found in {directory}'
AssertionError: Nothing found in output/ssdg_pacs/nlab_210/StyleMatch/resnet18/v1/art_painting
The error did not reproduce.
Thank you very much
Your log file doesn't contain any results for the code to extract, since it ends with `Initializing summary writer`.
What the code does is, after seeing `Finished training`, it detects the keywords `accuracy` and `error` and extracts the values.
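A minimal sketch of that extraction logic, for illustration only. It is not the actual `parse_test_res.py` code: the names `END_SIGNAL`, `METRIC_RE`, and `extract_results` are hypothetical, and the regex is an assumption based on the result lines printed at the end of a finished log.

```python
import re

END_SIGNAL = "Finished training"
# Hypothetical pattern: matches lines such as "* accuracy: 78.37%".
METRIC_RE = re.compile(r"\*\s*(accuracy|error):\s*([\d.]+)%")


def extract_results(log_text):
    """Return the metrics found after the end signal, or {} if there are none."""
    results = {}
    seen_end = False
    for line in log_text.splitlines():
        if END_SIGNAL in line:
            seen_end = True
            continue
        if not seen_end:
            # Ignore everything printed before training finished.
            continue
        match = METRIC_RE.search(line)
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


finished_log = """Finished training
Do evaluation on test set
=> result
* total: 2,048
* correct: 1,605
* accuracy: 78.37%
* error: 21.63%
"""
# A log that stops right after the tensorboard writer is created.
unfinished_log = "Initializing summary writer for tensorboard with log_dir=output\n"

print(extract_results(finished_log))    # {'accuracy': 78.37, 'error': 21.63}
print(extract_results(unfinished_log))  # {}
```

With a log that stops before `Finished training`, nothing is collected, which is consistent with the `Nothing found in ...` assertion in the traceback.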
I noticed that too.
So I tried running tools/train.py from the Dassl.pytorch repo to train from scratch, but this was the result.
***************
** Arguments **
***************
backbone:
config_file:
dataset_config_file:
eval_only: False
head:
load_epoch: None
model_dir:
no_train: False
opts: []
output_dir:
resume:
root:
seed: -1
source_domains: None
target_domains: None
trainer:
transforms: None
************
** Config **
************
DATALOADER:
K_TRANSFORMS: 1
NUM_WORKERS: 4
RETURN_IMG0: False
TEST:
BATCH_SIZE: 32
SAMPLER: SequentialSampler
TRAIN_U:
BATCH_SIZE: 32
N_DOMAIN: 0
N_INS: 16
SAME_AS_X: True
SAMPLER: RandomSampler
TRAIN_X:
BATCH_SIZE: 32
N_DOMAIN: 0
N_INS: 16
SAMPLER: RandomSampler
DATASET:
ALL_AS_UNLABELED: False
CIFAR_C_LEVEL: 1
CIFAR_C_TYPE:
NAME:
NUM_LABELED: -1
NUM_SHOTS: -1
ROOT:
SOURCE_DOMAINS: ()
STL10_FOLD: -1
TARGET_DOMAINS: ()
VAL_PERCENT: 0.1
INPUT:
COLORJITTER_B: 0.4
COLORJITTER_C: 0.4
COLORJITTER_H: 0.1
COLORJITTER_S: 0.4
CROP_PADDING: 4
CUTOUT_LEN: 16
CUTOUT_N: 1
GB_K: 21
GB_P: 0.5
GN_MEAN: 0.0
GN_STD: 0.15
INTERPOLATION: bilinear
NO_TRANSFORM: False
PIXEL_MEAN: [0.485, 0.456, 0.406]
PIXEL_STD: [0.229, 0.224, 0.225]
RANDAUGMENT_M: 10
RANDAUGMENT_N: 2
RGS_P: 0.2
SIZE: (224, 224)
TRANSFORMS: ()
MODEL:
BACKBONE:
NAME:
PRETRAINED: True
HEAD:
ACTIVATION: relu
BN: True
DROPOUT: 0.0
HIDDEN_LAYERS: ()
NAME:
INIT_WEIGHTS:
OPTIM:
ADAM_BETA1: 0.9
ADAM_BETA2: 0.999
BASE_LR_MULT: 0.1
GAMMA: 0.1
LR: 0.0003
LR_SCHEDULER: single_step
MAX_EPOCH: 10
MOMENTUM: 0.9
NAME: adam
NEW_LAYERS: ()
RMSPROP_ALPHA: 0.99
SGD_DAMPNING: 0
SGD_NESTEROV: False
STAGED_LR: False
STEPSIZE: (-1,)
WARMUP_CONS_LR: 1e-05
WARMUP_EPOCH: -1
WARMUP_MIN_LR: 1e-05
WARMUP_RECOUNT: True
WARMUP_TYPE: linear
WEIGHT_DECAY: 0.0005
OUTPUT_DIR: ./output
RESUME:
SEED: -1
TEST:
COMPUTE_CMAT: False
EVALUATOR: Classification
FINAL_MODEL: last_step
NO_TEST: False
PER_CLASS_RESULT: False
SPLIT: test
TRAIN:
CHECKPOINT_FREQ: 0
COUNT_ITER: train_x
PRINT_FREQ: 10
TRAINER:
CG:
ALPHA_D: 0.5
ALPHA_F: 0.5
EPS_D: 1.0
EPS_F: 1.0
DAEL:
CONF_THRE: 0.95
STRONG_TRANSFORMS: ()
WEIGHT_U: 0.5
DDAIG:
ALPHA: 0.5
CLAMP: False
CLAMP_MAX: 1.0
CLAMP_MIN: -1.0
G_ARCH:
LMDA: 0.3
WARMUP: 0
ENTMIN:
LMDA: 0.001
FIXMATCH:
CONF_THRE: 0.95
STRONG_TRANSFORMS: ()
WEIGHT_U: 1.0
M3SDA:
LMDA: 0.5
N_STEP_F: 4
MCD:
N_STEP_F: 4
MEANTEA:
EMA_ALPHA: 0.999
RAMPUP: 5
WEIGHT_U: 1.0
MIXMATCH:
MIXUP_BETA: 0.75
RAMPUP: 20000
TEMP: 2.0
WEIGHT_U: 100.0
MME:
LMDA: 0.1
NAME:
SE:
CONF_THRE: 0.95
EMA_ALPHA: 0.999
RAMPUP: 300
USE_CUDA: True
VERBOSE: True
VERSION: 1
Collecting env info ...
** System info **
PyTorch version: 1.8.1
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Python version: 3.7 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] torch==1.8.1
[pip3] torchvision==0.9.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.1.243 h6bb024c_0
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.3.0 h06a4308_520
[conda] mkl-service 2.4.0 py37h7f8727e_0
[conda] mkl_fft 1.3.0 py37h42c9631_2
[conda] mkl_random 1.2.2 py37h51133e4_0
[conda] numpy 1.21.2 pypi_0 pypi
[conda] numpy-base 1.20.3 py37h74d4b33_0
[conda] pytorch 1.8.1 py3.7_cuda10.1_cudnn7.6.3_0 pytorch
[conda] torchvision 0.9.1 py37_cu101 pytorch
Pillow (8.3.1)
Traceback (most recent call last):
File "train.py", line 190, in <module>
main(args)
File "train.py", line 106, in main
trainer = build_trainer(cfg)
File "/home/johnsonibironke/Documents/Github_files/Dassl.pytorch/dassl/engine/build.py", line 8, in build_trainer
check_availability(cfg.TRAINER.NAME, avai_trainers)
File "/home/johnsonibironke/Documents/Github_files/Dassl.pytorch/dassl/utils/tools.py", line 178, in check_availability
'(do you mean [{}]?)'.format(available, requested, psb_ans)
ValueError: The requested one is expected to belong to ['MCD', 'MME', 'ADDA', 'DAEL', 'DANN', 'AdaBN', 'M3SDA', 'SourceOnly', 'SelfEnsembling', 'DDAIG', 'DAELDG', 'Vanilla', 'CrossGrad', 'EntMin', 'FixMatch', 'MixMatch', 'MeanTeacher', 'SupBaseline'], but got [] (do you mean [SupBaseline]?)
Do you think this can affect the training of ssdg-benchmark? And what can I do to complete the training?
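For context, every field in the Arguments dump of that run is blank and the ValueError reports `got []`, which suggests train.py was started without a trainer name. The sketch below only mimics that kind of name check. It is not Dassl's `check_availability`; `DASSL_TRAINERS`, `is_registered`, and `require_trainer` are hypothetical names, and the list is copied from the error message.

```python
# Trainer names copied from the ValueError message above.
DASSL_TRAINERS = [
    "MCD", "MME", "ADDA", "DAEL", "DANN", "AdaBN", "M3SDA", "SourceOnly",
    "SelfEnsembling", "DDAIG", "DAELDG", "Vanilla", "CrossGrad", "EntMin",
    "FixMatch", "MixMatch", "MeanTeacher", "SupBaseline",
]


def is_registered(name, available=DASSL_TRAINERS):
    """Return True if the requested trainer name is in the available list."""
    return name in available


def require_trainer(name, available=DASSL_TRAINERS):
    """Raise ValueError for an unknown (or empty) trainer name."""
    if not is_registered(name, available):
        raise ValueError(f"Trainer {name!r} is not one of {available}")
    return name


# An empty name fails, and so does "StyleMatch" against this particular list.
for candidate in ("", "StyleMatch", "FixMatch"):
    print(repr(candidate), is_registered(candidate))
```

Note that `StyleMatch` is absent from the list in the error message, so, as an assumption, it has to be launched through the ssdg-benchmark code rather than through the plain Dassl.pytorch script.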
did you follow exactly the instructions outlined here https://github.com/KaiyangZhou/ssdg-benchmark#how-to-run-stylematch?
it would be easier for me to identify your problem if you could provide more details on how you reached that error, like any changes you made to the code or what steps you followed (after installation).
Yes, I did follow the instructions from https://github.com/KaiyangZhou/ssdg-benchmark#how-to-run-stylematch. I made no changes to the code.
But I am guessing the problem came from not running train.py from the Dassl.pytorch repo (which is supposed to train from scratch) when I first downloaded the dataset.
However, when I tried running it, it gave me the error I posted in my comment immediately above yours.
I've checked the code and found no issue with running it
what do you see if you run
conda activate dassl
cd ssdg-benchmark/scripts/StyleMatch
bash run_ssdg.sh ssdg_pacs 210 v1
It runs for about 20 iterations and produces an output folder.
Here is the folder: output.zip
do you see the results in the log files?
like this (at the very end)
...
Finished training
Do evaluation on test set
=> result
* total: 2,048
* correct: 1,605
* accuracy: 78.37%
* error: 21.63%
if so, `parse_test_res.py` should work (it will extract the result values, e.g. `78.37` for `accuracy`)
Looking through the code, I found this
It turns out that inside `parse_function`, the regex doesn't match what is in the `log.txt`. As a result, the output variable of the code shown below is never populated. What am I doing wrong?
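One way to narrow this down is to check, per seed, whether the log contains the end signal at all and whether a metric regex matches any line. The sketch below is a hypothetical diagnostic, not part of the repository: `diagnose`, `END_SIGNAL`, and `METRIC_RE` are made-up names, the pattern should be swapped for the one actually used in `parse_function`, and the `<experiment>/<seed>/log.txt` layout is assumed from the paths posted earlier in the thread.

```python
import re
import tempfile
from pathlib import Path

END_SIGNAL = "Finished training"
# Hypothetical pattern; replace with the regex used in parse_function.
METRIC_RE = re.compile(r"\*\s*accuracy:\s*([\d.]+)%")


def diagnose(exp_dir):
    """Map each seed directory name to a short status string."""
    report = {}
    for log_path in sorted(Path(exp_dir).glob("*/log.txt")):
        lines = log_path.read_text().splitlines()
        seed = log_path.parent.name
        if not any(END_SIGNAL in line for line in lines):
            report[seed] = "no end signal (training did not finish)"
        elif not any(METRIC_RE.search(line) for line in lines):
            report[seed] = "end signal found, but regex matched nothing"
        else:
            report[seed] = "ok"
    return report


# Self-contained demo on throwaway files; point diagnose() at a real
# experiment directory (one that holds seed1, seed2, ...) instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_logs = {
        "seed1": "Initializing summary writer for tensorboard\n",
        "seed2": "Finished training\n* accuracy: 78.37%\n* error: 21.63%\n",
    }
    for seed, text in demo_logs.items():
        seed_dir = Path(tmp) / seed
        seed_dir.mkdir()
        (seed_dir / "log.txt").write_text(text)
    demo_report = diagnose(tmp)

print(demo_report)
```

If every seed reports a missing end signal, the regex is not the problem: the training run itself stopped before evaluation, so there is nothing for the parser to match.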