Closed kkristacia closed 1 month ago

Hi, thank you for developing this package! I want to be able to load an already saved model and then use it for inference, as in production. How can I make the inference dataset go through the same preprocessing steps, e.g. one-hot encoding of categorical variables and scaling?

Hi @kkristacia,
To load the model, you just need to run the same steps you used to create it. The only difference is that when calling `model = AutoModelWithTabular.from_pretrained(...)`, make sure you set the first argument, `pretrained_model_name_or_path`, to the path where you saved your model.
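Here's a rough sketch of what I mean (the path, `TabularConfig` values, and feature dimensions below are placeholders — use whatever you had in your own training script):

```python
from transformers import AutoConfig
from multimodal_transformers.model import TabularConfig, AutoModelWithTabular

saved_model_dir = './saved_model'  # placeholder: wherever you saved the fine-tuned model

# Rebuild the config exactly as you did at training time; only the path changes.
config = AutoConfig.from_pretrained(saved_model_dir)
tabular_config = TabularConfig(
    num_labels=2,                  # same values you used when training
    cat_feat_dim=10,               # dimension of the one-hot encoded categorical features
    numerical_feat_dim=3,          # number of numerical features
    combine_feat_method='gating_on_cat_and_num_feats_then_sum',
)
config.tabular_config = tabular_config

# Point the first argument at the saved directory instead of a hub checkpoint.
model = AutoModelWithTabular.from_pretrained(saved_model_dir, config=config)
model.eval()
```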
Similarly, to preprocess the inference dataset, I would recommend calling `load_data_from_folder` with the same parameters you used during training. Use the same training data so the encoders are reconstructed identically, and replace the test data with your inference data. I know this isn't optimal, so we'll definitely change this in a future version.
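A sketch of that workaround, assuming a folder laid out the same way as during training (train.csv, val.csv, test.csv) where test.csv has been swapped for the inference examples; the folder path and column names are placeholders, and if your inference file has no ground-truth labels you may need a placeholder label column so the schema matches training:

```python
from transformers import AutoTokenizer
from multimodal_transformers.data import load_data_from_folder

tokenizer = AutoTokenizer.from_pretrained('./saved_model')  # same tokenizer as training

# Same call and same column arguments as at training time. train.csv is the
# original training data so the categorical encoder / numerical scaler are
# refit identically; test.csv now holds the inference examples.
train_dataset, val_dataset, test_dataset = load_data_from_folder(
    './datasets/inference_run',             # placeholder folder with train/val/test csvs
    ['title', 'description'],               # text columns, same as training
    tokenizer,
    label_col='label',
    categorical_cols=['brand', 'category'],
    numerical_cols=['price', 'rating'],
    sep_text_token_str=tokenizer.sep_token,
    categorical_encode_type='ohe',
    numerical_transformer_method='quantile_normal',
)

# test_dataset now contains the preprocessed inference rows; pass it to
# Trainer.predict(test_dataset) or iterate over it directly.
```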
Please let me know if you run into any other issues and I'll help you solve them! :)
Hi Akash, thanks for the clarification. Yeah, I was hoping for some way to avoid using the training data during inference. It would definitely be great if future versions had that functionality!
Hi Akash. Just to second this - it would be great if the preprocessing objects were saved for making inferences in production. Loading my whole dataset into my production environment would take up space unnecessarily. Love the toolkit, and looking forward to seeing an update in the future!
Thanks @dsunart! I'm reopening this issue as a feature request. It should be added in as part of our next release!
Hey @kkristacia and @dsunart, happy to note that this is now part of the toolkit. You can see this in action in this example.