@muhammad-yasir
Re:
To predict/test the test dataset, I am not sure how to use the json files generated by the above command. I think I should use the predict command to apply the trained model to the test data, so I wrote the following command:
ludwig predict --data_set test.csv --model_path 'path of json file'
The model_path parameter does not refer to the json file. This parameter refers to the model sub-directory. For example, if you run with the default settings for the experiment_name, model_name and output_directory parameters, you should see the following directory structure:
results/
  experiment_run/
    model/
Note the above is only a subset of what will be created. Also note that the sub-directory experiment_run is the concatenation of the experiment_name and model_name parameters. The --model_path parameter should point to the model sub-directory: --model_path results/experiment_run/model
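For example, with those defaults and the test file from your command, the predict call would look something like this (a sketch assuming Ludwig 0.3's --dataset flag and the default output locations; adjust the model path to whatever directory was actually created on your machine):
# --model_path points at the model sub-directory, not at a json file
ludwig predict --dataset test.csv --model_path results/experiment_run/model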
Please refer to the Ludwig user guide for details.
Let me know if this answered your question.
Hi Jim,
Thanks for your response. I tried your instructions. I first trained the model with the default settings using the following command: ludwig experiment --dataset train.csv --config_file model_definition.yml -kf=10
When the training completed, it did not save into the directory structure mentioned in your reply (i.e. results/experiment_run/model). In fact, I noticed that the output of the experiment is not saved in the directory where the .yml file is located. For example, I have attached a screenshot of the CLI which shows where the model was saved.
It looks like the program has saved it into a temp folder. Any thoughts?
@muhammad-yasir The intent of the k_fold parameter is to provide an assessment of model performance that is more robust than a single performance report on a single test dataset. This is done by building a model on each of the different folds of the training data and reporting the performance on each fold. In other words, there are as many models as folds, each built with a different subset of the training data. The intent of experiment with the k_fold option is to provide a model performance assessment, not a single model.
If you need a model to make predictions, then use train or experiment without the k_fold option. This should produce a model in the location described in the earlier posting.
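As a minimal sketch of that workflow, reusing the file names and flags from your earlier commands and assuming the default experiment_name/model_name (so the exact sub-directory may differ on your machine):
# train a single model (no k_fold), then predict on the held-out test set
ludwig train --dataset train.csv --config_file model_definition.yml --output_directory results
ludwig predict --dataset test.csv --model_path results/experiment_run/model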
Hope this helps. If not feel free to continue the dialog.
Thanks, Jim. I really appreciate your response. Now I understand the purpose.
Hi there,
I am a newbie and trying to learn Ludwig.
I am using Ludwig 0.3.3 in the CLI.
I am trying to train a model using cross-fold validation, and I have separate train and test datasets. I read the documentation (https://ludwig-ai.github.io/ludwig-docs/user_guide/#experiment); according to the documentation, the Ludwig experiment command combines training and evaluation into a single command. So I used the following command to run the experiment: ludwig experiment --data_set train_set.csv -cf model_definition.yml --output_directory results -kf=10
The above command generated two json files (i.e. kfold_split_indices and kfold_training_statistics).
To predict/test the test dataset, I am not sure how to use the json files generated by the above command. I think I should use the predict command to apply the trained model to the test data, so I wrote the following command:
ludwig predict --data_set test.csv --model_path 'path of json file'
But the above command did not test the data.
Could you please help me figure out where I am making a mistake?