summary: I trained a model several times on the same dataset with the same random_seed (the default, 42), and every run produces a different final model (a text classifier with negative, positive, and neutral categories). Either I am doing something wrong, or the random_seed parameter does not control all sources of variability.
env:
OS: Ubuntu 20.04.2 LTS
Python: 3.8.10
ludwig_version: '0.3.3'
tf_version: '2.4.3'
text sample (romanian): Un telefon slab pentru "statutul" de premium. L-am returnat. Cand zic slab ma refer la urmatoarele minusuri: - este doar SuperAmoled Plus - are doar 393 ppi la un ecran de 6.7 " (un simplu Huawei P30 are 422 ppi, la un ecran de 6.1 " ). Am pus acelasi film 4k la ambele iar diferenta este uriasa: P30 are culori vii si claritate f. buna, in schimb Note 20 are culori putin sterse in multe cadre si are o luminozitate considerabil mai mica - rezolutie doar 1080 x 2340 pixeli la un ecran asa mare! (P30 are 1080 x 2400) - bateria se descarca destul de repede la o folosire medie (100% dimineata, pe la ora 18-19 mai sunt 15 %) - camera foto nu m-a impresionat (vechiul P30 face poze mai bune) - amprenta o recunoaste in 70-80% din cazuri din prima (P30 recunoaste amprenta 100% din cazuri din prima) Concluzie: - daca vrei ceva premium : Note 10 Plus(are Dynamic Amoled, 1440 x 3040 pixeli, 498 ppi) sau Note 20 Ultra (are Dynamic Amoles 2X, 1440 x 3088, 496 ppi) - daca nu te uiti la filme si iti doresti 256 GB memorie, atunci ia-ti Note 20. Daca insa vrei rezolutie buna, baterie buna si poze excelente, mergi spre Huawei
command:
/usr/local/bin/ludwig experiment --gpus -1 \
--output_directory /tmp/training_arena/08347ec2-6764-4427-b6f1-8f4f11dc2686_classifier_131 \
--dataset /tmp/training_arena/08347ec2-6764-4427-b6f1-8f4f11dc2686_classifier_131/dataset.csv \
--config_file /tmp/training_arena/08347ec2-6764-4427-b6f1-8f4f11dc2686_classifier_131/model_config.yaml
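To make "a different final model" measurable, one option is to diff the statistics written by two runs of the command above. This is only a sketch: the helper names `flatten`, `diff_stats` and `diff_stat_files` are hypothetical, and it assumes each run leaves a JSON statistics file somewhere under its output directory (the exact file name and layout are not shown in this report).

```python
import json
import math


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into {dotted_path: leaf_value}."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            items.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}{index}."))
    else:
        items[prefix.rstrip(".")] = obj
    return items


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def diff_stats(stats_a, stats_b, tol=0.0):
    """Return {path: (a, b)} for every leaf that differs by more than tol."""
    flat_a, flat_b = flatten(stats_a), flatten(stats_b)
    diffs = {}
    for path in sorted(set(flat_a) | set(flat_b)):
        a, b = flat_a.get(path), flat_b.get(path)
        if _is_number(a) and _is_number(b):
            # Absolute tolerance only; tol=0.0 means exact equality.
            if not math.isclose(a, b, rel_tol=0.0, abs_tol=tol):
                diffs[path] = (a, b)
        elif a != b:
            diffs[path] = (a, b)
    return diffs


def diff_stat_files(path_a, path_b, tol=0.0):
    """Load two JSON statistics files (paths are assumptions) and diff them."""
    with open(path_a) as file_a, open(path_b) as file_b:
        return diff_stats(json.load(file_a), json.load(file_b), tol)
```

An empty result with `tol=0.0` would mean the two runs are bit-for-bit identical in their reported metrics; small non-zero differences would point to numerical nondeterminism rather than a different data split.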
config:
{ 'combiner': {'type': 'concat'},
'input_features': [ { 'column': 'text',
'encoder': 'parallel_cnn',
'level': 'word',
'name': 'text',
'preprocessing': { 'lowercase': True,
'word_tokenizer': 'romanian_tokenize_punctuation'},
'proc_column': 'text_i4HJUa',
'tied': None,
'type': 'text'}],
'output_features': [ { 'column': 'class',
'dependencies': [],
'loss': { 'class_similarities_temperature': 0,
'class_weights': 1,
'confidence_penalty': 0,
'labels_smoothing': 0,
'robust_lambda': 0,
'type': 'softmax_cross_entropy',
'weight': 1},
'name': 'class',
'proc_column': 'class_mZFLky',
'reduce_dependencies': 'sum',
'reduce_input': 'sum',
'top_k': 3,
'type': 'category'}],
'preprocessing': { 'audio': { 'audio_feature': {'type': 'raw'},
'audio_file_length_limit_in_s': 7.5,
'in_memory': True,
'missing_value_strategy': 'backfill',
'norm': None,
'padding_value': 0},
'bag': { 'fill_value': '',
'lowercase': False,
'missing_value_strategy': 'fill_with_const',
'most_common': 10000,
'tokenizer': 'space'},
'binary': { 'fill_value': 0,
'missing_value_strategy': 'fill_with_const'},
'category': { 'fill_value': '',
'lowercase': False,
'missing_value_strategy': 'fill_with_const',
'most_common': 10000},
'date': { 'datetime_format': None,
'fill_value': '',
'missing_value_strategy': 'fill_with_const'},
'force_split': False,
'h3': { 'fill_value': 576495936675512319,
'missing_value_strategy': 'fill_with_const'},
'image': { 'in_memory': True,
'missing_value_strategy': 'backfill',
'num_processes': 1,
'resize_method': 'interpolate',
'scaling': 'pixel_normalization'},
'numerical': { 'fill_value': 0,
'missing_value_strategy': 'fill_with_const',
'normalization': None},
'sequence': { 'fill_value': '',
'lowercase': False,
'missing_value_strategy': 'fill_with_const',
'most_common': 20000,
'padding': 'right',
'padding_symbol': '',
'sequence_length_limit': 256,
'tokenizer': 'space',
'unknown_symbol': '',
'vocab_file': None},
'set': { 'fill_value': '',
'lowercase': False,
'missing_value_strategy': 'fill_with_const',
'most_common': 10000,
'tokenizer': 'space'},
'split_probabilities': (0.7, 0.1, 0.2),
'stratify': None,
'text': { 'char_most_common': 70,
'char_sequence_length_limit': 1024,
'char_tokenizer': 'characters',
'char_vocab_file': None,
'fill_value': '',
'lowercase': True,
'missing_value_strategy': 'fill_with_const',
'padding': 'right',
'padding_symbol': '',
'pretrained_model_name_or_path': None,
'unknown_symbol': '',
'word_most_common': 20000,
'word_sequence_length_limit': 256,
'word_tokenizer': 'romanian_tokenize_punctuation',
'word_vocab_file': None},
'timeseries': { 'fill_value': '',
'missing_value_strategy': 'fill_with_const',
'padding': 'right',
'padding_value': 0,
'timeseries_length_limit': 256,
'tokenizer': 'space'},
'vector': { 'fill_value': '',
'missing_value_strategy': 'fill_with_const'}},
'training': { 'batch_size': 32,
'bucketing_field': None,
'decay': False,
'decay_rate': 0.96,
'decay_steps': 10000,
'early_stop': 5,
'epochs': 100,
'eval_batch_size': 0,
'gradient_clipping': None,
'increase_batch_size_on_plateau': 0,
'increase_batch_size_on_plateau_max': 512,
'increase_batch_size_on_plateau_patience': 5,
'increase_batch_size_on_plateau_rate': 2,
'learning_rate': 0.001,
'learning_rate_warmup_epochs': 1,
'optimizer': { 'beta_1': 0.9,
'beta_2': 0.999,
'epsilon': 1e-08,
'type': 'adam'},
'reduce_learning_rate_on_plateau': 0,
'reduce_learning_rate_on_plateau_patience': 5,
'reduce_learning_rate_on_plateau_rate': 0.5,
'regularization_lambda': 0,
'regularizer': 'l2',
'staircase': False,
'validation_field': 'combined',
'validation_metric': 'loss'}}
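For anyone trying to narrow this down from the Python side, here is a rough sketch of seeding every random number generator that might be involved. It is not Ludwig's own seeding code: `seed_everything` is a hypothetical helper, and whether GPU kernel nondeterminism is actually the cause here is an assumption that the report does not confirm. `tf.random.set_seed` and `np.random.seed` are real calls; to my understanding, TensorFlow releases around 2.4 read the `TF_DETERMINISTIC_OPS` environment variable to prefer deterministic GPU kernels where they exist.

```python
import os
import random


def seed_everything(seed=42):
    """Seed the RNGs that are importable; return the names that were seeded."""
    seeded = []

    # Assumption: must be set before TensorFlow runs any GPU op to have an effect.
    os.environ["TF_DETERMINISTIC_OPS"] = "1"

    # Only affects Python processes started after this point, not the current one.
    os.environ["PYTHONHASHSEED"] = str(seed)

    random.seed(seed)
    seeded.append("random")

    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass  # NumPy not installed; nothing to seed.

    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
        seeded.append("tensorflow")
    except ImportError:
        pass  # TensorFlow not installed; nothing to seed.

    return seeded
```

If two runs still diverge after seeding like this with `--gpus -1`, repeating the comparison on CPU only would help separate GPU nondeterminism from an unseeded step in preprocessing or data splitting.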