Ludwig not running on Anaconda, suddenly stopping just before starting to Train.

MarijnQ commented 1 year ago

When I run Ludwig in Anaconda, I get an experiment description, the data is preprocessed, and I see the "MODEL" and "Warnings and other logs" headings, but these stay empty.

I am running Python 3 (ipykernel) in Jupyter Notebook 6.4.8 on a MacBook Pro M1 running macOS Ventura 13.1 (22C65).

What's going wrong here? It does run on Google Colab (with PyTorch; I tried this in the Jupyter Notebook too, but with no success either). I just want to train on my data on my laptop, making use of the M1 chip.

I'm running the following code:

!pip install tensorflow
import pandas as pd
from datetime import datetime as dt
import numpy as np
from pandas.core.base import value_counts

# importing and cleaning dataset and saving it as 'CompanyAndIndust.csv'

model_definition="""

input_features:
    -
        name: name
        type: text
        level: word
        encoder: parallel_cnn

output_features:
    -
        name: industry
        type: text

"""

with open("model_definition.yaml", "w") as f:
  f.write(model_definition)

!ludwig experiment \
 --dataset CompanyAndIndust.csv\
 --config model_definition.yaml

I get no errors or anything; it just stops after printing the following:

Note: NumExpr detected 10 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
NumExpr defaulting to 8 threads.
███████████████████████
█ █ █ █  ▜█ █ █ █ █   █
█ █ █ █ █ █ █ █ █ █ ███
█ █   █ █ █ █ █ █ █ ▌ █
█ █████ █ █ █ █ █ █ █ █
█     █  ▟█     █ █   █
███████████████████████
ludwig v0.6 - Experiment

╒════════════════════════╕
│ EXPERIMENT DESCRIPTION │
╘════════════════════════╛

╒══════════════════╤═══════════════════════════════════════════════════════════════════════════════╕
│ Experiment name  │ experiment                                                                    │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ Model name       │ run                                                                           │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ Output directory │ /Users/marijnquartel/Documents/Data/Industry Report/results/experiment_run_10 │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ ludwig_version   │ '0.6'                                                                         │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ command          │ ('/Users/marijnquartel/opt/anaconda3/bin/ludwig experiment --dataset '        │
│                  │  'CompanyAndIndust.csv --config model_definition.yaml')                       │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ random_seed      │ 42                                                                            │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ dataset          │ 'CompanyAndIndust.csv'                                                        │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ data_format      │ 'csv'                                                                         │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ torch_version    │ '1.13.1'                                                                      │
├──────────────────┼───────────────────────────────────────────────────────────────────────────────┤
│ compute          │ {'num_nodes': 1}                                                              │
╘══════════════════╧═══════════════════════════════════════════════════════════════════════════════╛

╒═══════════════╕
│ LUDWIG CONFIG │
╘═══════════════╛

{   'combiner': {   'activation': 'relu',
                    'bias_initializer': 'zeros',
                    'dropout': 0.0,
                    'fc_layers': None,
                    'flatten_inputs': False,
                    'norm': None,
                    'norm_params': None,
                    'num_fc_layers': 0,
                    'output_size': 256,
                    'residual': False,
                    'type': 'concat',
                    'use_bias': True,
                    'weights_initializer': 'xavier_uniform'},
    'defaults': {   'audio': {   'preprocessing': {   'audio_file_length_limit_in_s': 7.5,
                                                      'computed_fill_value': None,
                                                      'fill_value': None,
                                                      'in_memory': True,
                                                      'missing_value_strategy': 'bfill',
                                                      'norm': None,
                                                      'num_fft_points': None,
                                                      'num_filter_bands': 80,
                                                      'padding_value': 0.0,
                                                      'type': 'fbank',
                                                      'window_length_in_s': 0.04,
                                                      'window_shift_in_s': 0.02,
                                                      'window_type': 'hamming'}},
                    'bag': {   'preprocessing': {   'computed_fill_value': '<UNK>',
                                                    'fill_value': '<UNK>',
                                                    'lowercase': False,
                                                    'missing_value_strategy': 'fill_with_const',
                                                    'most_common': 10000,
                                                    'tokenizer': 'space'}},
                    'binary': {   'preprocessing': {   'computed_fill_value': None,
                                                       'fallback_true_label': None,
                                                       'fill_value': None,
                                                       'missing_value_strategy': 'fill_with_false'}},
                    'category': {   'preprocessing': {   'computed_fill_value': '<UNK>',
                                                         'fill_value': '<UNK>',
                                                         'lowercase': False,
                                                         'missing_value_strategy': 'fill_with_const',
                                                         'most_common': 10000}},
                    'date': {   'preprocessing': {   'computed_fill_value': '',
                                                     'datetime_format': None,
                                                     'fill_value': '',
                                                     'missing_value_strategy': 'fill_with_const'}},
                    'h3': {   'preprocessing': {   'computed_fill_value': 576495936675512319,
                                                   'fill_value': 576495936675512319,
                                                   'missing_value_strategy': 'fill_with_const'}},
                    'image': {   'preprocessing': {   'computed_fill_value': None,
                                                      'fill_value': None,
                                                      'height': None,
                                                      'in_memory': True,
                                                      'infer_image_dimensions': True,
                                                      'infer_image_max_height': 256,
                                                      'infer_image_max_width': 256,
                                                      'infer_image_num_channels': True,
                                                      'infer_image_sample_size': 100,
                                                      'missing_value_strategy': 'bfill',
                                                      'num_channels': None,
                                                      'num_processes': 1,
                                                      'resize_method': 'interpolate',
                                                      'scaling': 'pixel_normalization',
                                                      'width': None}},
                    'number': {   'preprocessing': {   'computed_fill_value': 0.0,
                                                       'fill_value': 0.0,
                                                       'missing_value_strategy': 'fill_with_const',
                                                       'normalization': None}},
                    'sequence': {   'preprocessing': {   'computed_fill_value': '<UNK>',
                                                         'fill_value': '<UNK>',
                                                         'lowercase': False,
                                                         'max_sequence_length': 256,
                                                         'missing_value_strategy': 'fill_with_const',
                                                         'most_common': 20000,
                                                         'padding': 'right',
                                                         'padding_symbol': '<PAD>',
                                                         'tokenizer': 'space',
                                                         'unknown_symbol': '<UNK>',
                                                         'vocab_file': None}},
                    'set': {   'preprocessing': {   'computed_fill_value': '<UNK>',
                                                    'fill_value': '<UNK>',
                                                    'lowercase': False,
                                                    'missing_value_strategy': 'fill_with_const',
                                                    'most_common': 10000,
                                                    'tokenizer': 'space'}},
                    'text': {   'preprocessing': {   'computed_fill_value': '<UNK>',
                                                     'fill_value': '<UNK>',
                                                     'lowercase': True,
                                                     'max_sequence_length': 256,
                                                     'missing_value_strategy': 'fill_with_const',
                                                     'most_common': 20000,
                                                     'padding': 'right',
                                                     'padding_symbol': '<PAD>',
                                                     'pretrained_model_name_or_path': None,
                                                     'tokenizer': 'space_punct',
                                                     'unknown_symbol': '<UNK>',
                                                     'vocab_file': None}},
                    'timeseries': {   'preprocessing': {   'computed_fill_value': '',
                                                           'fill_value': '',
                                                           'missing_value_strategy': 'fill_with_const',
                                                           'padding': 'right',
                                                           'padding_value': 0.0,
                                                           'timeseries_length_limit': 256,
                                                           'tokenizer': 'space'}},
                    'vector': {   'preprocessing': {   'computed_fill_value': '',
                                                       'fill_value': '',
                                                       'missing_value_strategy': 'fill_with_const',
                                                       'vector_size': None}}},
    'input_features': [   {   'column': 'name',
                              'encoder': {   'level': 'word',
                                             'type': 'parallel_cnn'},
                              'name': 'name',
                              'proc_column': 'name_mZFLky',
                              'tied': None,
                              'type': 'text'}],
    'ludwig_version': '0.6',
    'model_type': 'ecd',
    'output_features': [   {   'column': 'industry',
                               'decoder': {'type': 'generator'},
                               'dependencies': [],
                               'loss': {   'class_similarities_temperature': 0,
                                           'class_weights': None,
                                           'confidence_penalty': 0.0,
                                           'robust_lambda': 0,
                                           'type': 'sequence_softmax_cross_entropy',
                                           'unique': False,
                                           'weight': 1.0},
                               'name': 'industry',
                               'preprocessing': {   'missing_value_strategy': 'drop_row'},
                               'proc_column': 'industry_mZFLky',
                               'reduce_dependencies': 'sum',
                               'reduce_input': 'sum',
                               'type': 'text'}],
    'preprocessing': {   'oversample_minority': None,
                         'sample_ratio': 1.0,
                         'split': {   'probabilities': [0.7, 0.1, 0.2],
                                      'type': 'random'},
                         'undersample_majority': None},
    'trainer': {   'batch_size': 128,
                   'checkpoints_per_epoch': 0,
                   'decay': False,
                   'decay_rate': 0.96,
                   'decay_steps': 10000,
                   'early_stop': 5,
                   'epochs': 100,
                   'eval_batch_size': None,
                   'evaluate_training_set': True,
                   'gradient_clipping': {   'clipglobalnorm': 0.5,
                                            'clipnorm': None,
                                            'clipvalue': None},
                   'increase_batch_size_eval_metric': 'loss',
                   'increase_batch_size_eval_split': 'training',
                   'increase_batch_size_on_plateau': 0,
                   'increase_batch_size_on_plateau_max': 512,
                   'increase_batch_size_on_plateau_patience': 5,
                   'increase_batch_size_on_plateau_rate': 2.0,
                   'learning_rate': 0.001,
                   'learning_rate_scaling': 'linear',
                   'learning_rate_warmup_epochs': 1.0,
                   'optimizer': {   'amsgrad': False,
                                    'betas': (0.9, 0.999),
                                    'eps': 1e-08,
                                    'lr': 0.001,
                                    'type': 'adam',
                                    'weight_decay': 0.0},
                   'reduce_learning_rate_eval_metric': 'loss',
                   'reduce_learning_rate_eval_split': 'training',
                   'reduce_learning_rate_on_plateau': 0.0,
                   'reduce_learning_rate_on_plateau_patience': 5,
                   'reduce_learning_rate_on_plateau_rate': 0.5,
                   'regularization_lambda': 0.0,
                   'regularization_type': 'l2',
                   'should_shuffle': True,
                   'staircase': False,
                   'steps_per_checkpoint': 0,
                   'train_steps': None,
                   'type': 'trainer',
                   'validation_field': 'combined',
                   'validation_metric': 'loss'}}

╒═══════════════╕
│ PREPROCESSING │
╘═══════════════╛

Found cached dataset and meta.json with the same filename of the dataset, using them instead
Using full hdf5 and json
Loading data from: CompanyAndIndust.training.hdf5
Loading data from: CompanyAndIndust.validation.hdf5
Loading data from: CompanyAndIndust.test.hdf5

Dataset Statistics
╒════════════╤═══════════════╤════════════════════╕
│ Dataset    │   Size (Rows) │ Size (In Memory)   │
╞════════════╪═══════════════╪════════════════════╡
│ Training   │       1358000 │ 290.10 Mb          │
├────────────┼───────────────┼────────────────────┤
│ Validation │        194000 │ 41.44 Mb           │
├────────────┼───────────────┼────────────────────┤
│ Test       │        388000 │ 82.89 Mb           │
╘════════════╧═══════════════╧════════════════════╛

╒═══════╕
│ MODEL │
╘═══════╛

Warnings and other logs:

tgaddair commented 1 year ago

Hey @MarijnQ, to clarify the behavior you're seeing: is the process hanging or is it exiting out? If it's exiting, do you happen to know what the exit code is?

MarijnQ commented 1 year ago

@tgaddair In Anaconda I have it all in code blocks. It just finishes the block and leaves it there, with no exit code or error code. I can launch a new block if I want and the machine works, but Ludwig won't train.

tgaddair commented 1 year ago

Thanks @MarijnQ. I wonder if there's an error message getting swallowed by the notebook. A couple things to try:

Check the server logs from the notebook server to see if there's anything like an error message in there.
Try running the same code outside of the notebook, using an ordinary Python script or command-line, and see if it raises an error.

If neither of those works, I would try making sure our example scripts run, like the Titanic example we have here, to check whether the error is specific to your dataset / model config.
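
A minimal sketch of the "run it outside the notebook" suggestion, assuming the `ludwig` executable is on PATH and the two files from the original post are in the working directory. Running the command through Python's `subprocess` module from a plain script makes the exit code and anything written to stderr visible, which a notebook cell may hide. The helper name `run_and_report` and the value 127 for a missing executable are illustrative choices, not part of Ludwig.

```python
import subprocess


def run_and_report(cmd):
    """Run cmd (a list of strings) and return (exit_code, stdout, stderr)."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError as exc:
        # 127 mirrors the shell convention for "command not found" (illustrative choice).
        return 127, "", str(exc)
    return result.returncode, result.stdout, result.stderr


# The command from the original post, split into arguments.
LUDWIG_CMD = [
    "ludwig", "experiment",
    "--dataset", "CompanyAndIndust.csv",
    "--config", "model_definition.yaml",
]

if __name__ == "__main__":
    code, out, err = run_and_report(LUDWIG_CMD)
    print("exit code:", code)
    # Show only the tail of stderr, where a traceback would normally end.
    print("\n".join(err.splitlines()[-20:]))
```

On POSIX systems a negative exit code from `subprocess` means the process was terminated by a signal, which would help distinguish a crash from a clean exit.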

One other thing I'll mention is that we just landed support for M1 acceleration with MPS. To try it out, make sure you have the master branch of Ludwig installed and set LUDWIG_ENABLE_MPS=1 in the environment.
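
To make the environment-variable step concrete, here is a minimal sketch for a notebook cell or script. It assumes Ludwig picks up `LUDWIG_ENABLE_MPS` from the process environment, as described above, and that PyTorch, if installed, exposes `torch.backends.mps`. The helper name `mps_report` is made up for illustration.

```python
import os
import platform

# Set the flag before Ludwig is imported or launched. Assumption: Ludwig reads it
# from the process environment, as the comment above describes.
os.environ["LUDWIG_ENABLE_MPS"] = "1"


def mps_report():
    """Collect a few facts relevant to MPS acceleration; torch is optional here."""
    report = {
        "machine": platform.machine(),  # "arm64" for a native Apple Silicon Python
        "LUDWIG_ENABLE_MPS": os.environ.get("LUDWIG_ENABLE_MPS"),
        "mps_available": None,  # stays None when torch cannot be imported
    }
    try:
        import torch
    except ImportError:
        return report
    mps = getattr(torch.backends, "mps", None)  # absent in older torch releases
    report["mps_available"] = bool(mps is not None and mps.is_available())
    return report


print(mps_report())
```

Processes started from the same kernel, such as a `!ludwig experiment` cell, inherit `os.environ`, so the cell that sets the variable needs to run first. If `machine` reports `x86_64` on an M1, that would suggest an Intel build of Python running under Rosetta, in which case MPS is unlikely to be available.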

MarijnQ commented 1 year ago

@tgaddair Amazing, I'll try these tomorrow! 👍

MarijnQ commented 1 year ago

@tgaddair Alright, so when I set LUDWIG_ENABLE_MPS=1 I get the error ModuleNotFoundError: No module named 'mlflow'. In Jupyter Notebook I apparently can't get log files; I haven't found a way to get those.

I tried running it in a Mac terminal and made changes to the code so it should run, but it keeps popping up with syntax or indent errors. Is there another environment in which I could try it?

This is the current code I run now, not the one from the terminal:

!pip install 'ludwig[full]'--LUDWIG_ENABLE_MPS=1

!pip install torch -f https://download.pytorch.org/whl/cu113/torch_stable.html

import pandas as pd
from datetime import datetime as dt
import numpy as np
from pandas.core.base import value_counts

df2 = pd.read_csv("/Users/marijnquartel/Documents/Data/Industry Report/CompanyName_Industry.csv", index_col=0)

df2 = df2.replace({'industry': {'non-profit organization management': 'philanthropy', 'motion pictures and film':'entertainment', 'music':'entertainment','performing arts':'entertainment','law practice':'legal services','e-learning':'education','education management':'education','higher education':'education','media production':'entertainment','primary/secondary education':'education'}})

#Delete every category that has fewer than thresholdVal (1000) entries
thresholdVal = 1000
df = df2[df2.groupby("industry")["industry"].transform('size')>=thresholdVal]

#Create a sample of the dataset in equal sizes based on industry 
dataset = df.groupby('industry').apply(lambda x: x.sample(200,replace=True))
dataset.reset_index(drop=True, inplace=True)

dataset['industry'].replace(r'\s+', '_',regex=True,inplace=True)
dataset['industry'].replace('&', 'and',regex=True,inplace=True)
dataset['industry'].replace('/', '_or_',regex=True,inplace=True)
dataset['industry'].replace('-', '_',regex=True,inplace=True)
dataset['industry'].replace(',', '_',regex=True,inplace=True)
dataset.to_csv('CompanyAndIndust.csv')

model_definition="""

input_features:
    -
        name: name
        type: text
        level: word
        encoder: parallel_cnn

output_features:
    -
        name: industry
        type: text

"""

with open("model_definition.yaml", "w") as f:
  f.write(model_definition)

!ludwig experiment \
    --dataset CompanyAndIndust.csv\
    --config model_definition.yaml

MarijnQ commented 1 year ago

I just tried running the rotten tomatoes set you guys have in the getting started section on the website.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Feb 14 09:15:23 2023

@author: marijnquartel
"""

!pip install ludwig
!pip install tensorflow

import pandas as pd

df = pd.read_csv('/Users/marijnquartel/Downloads/rotten_tomatoes.csv')

model_definition="""

input_features:
    - name: genres
      type: set
      preprocessing:
          tokenizer: comma
    - name: content_rating
      type: category
    - name: top_critic
      type: binary
    - name: runtime
      type: number
    - name: review_content
      type: text
      encoder: embed
output_features:
    - name: recommended
      type: binary
"""

with open("model_definition.yaml", "w") as f:
  f.write(model_definition)

from ludwig.api import LudwigModel

!ludwig experiment \
 --dataset df\
 --config model_definition.yaml

This raises the error ModuleNotFoundError: No module named 'mlflow'
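
One hedged way to narrow down a `ModuleNotFoundError` like this is to check which packages the running interpreter can actually see, since a notebook kernel and a terminal `pip` can point at different environments. The sketch below uses only the standard library; the helper name `missing_modules` and the list of names are illustrative.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names mentioned in this thread; adjust as needed.
wanted = ["ludwig", "torch", "mlflow", "pandas"]
print("interpreter:", sys.executable)
print("missing:", missing_modules(wanted))
```

If `mlflow` shows up as missing, installing it with the interpreter printed above (for example by running `python -m pip install mlflow` with that same interpreter) is one thing to try. Whether that resolves the original problem is not confirmed in this thread.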

MarijnQ commented 1 year ago

Here I am again. I have tried running Colab on a local runtime and I got this error message:

/Users/marijnquartel/opt/anaconda3/lib/python3.9/site-packages/torch/nn/modules/conv.py:309: UserWarning: Using padding='same' with even kernel lengths and odd dilation may require a zero-padded copy of the input be created (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/Convolution.cpp:896.)
  return F.conv1d(input, weight, bias, self.stride,

I got no output when I enabled MPS.

ludwig-ai / ludwig

Ludwig not running on Anaconda, suddenly stopping just before starting to Train. #3059