ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Predict from stdin or via the programmatic api #87

Closed loretoparisi closed 1 month ago

loretoparisi commented 5 years ago

As soon as my brand new model is ready, I would like ludwig to enable predictions via the command line, so that one could do something like

echo "call me at +3912345679 for offers" | ludwig predict --only_prediction --model_path /path/to/model -

where the - indicates stdin (as an example) and returns the model predictions directly. This would help to integrate the ludwig command into an inference pipeline.

If I have understood the api correctly (my model is not ready yet...), it should be possible via the programmatic api like this:

import logging

from ludwig.api import LudwigModel

# load a trained model once
model_path = '/path/to/model'
model = LudwigModel.load(model_path)

# obtain predictions for an in-memory sample
myDict = {'text': ['call me at +3912345679 for offers']}
predictions = model.predict(data_dict=myDict,
                            return_type=dict,
                            batch_size=128,
                            gpus=None,
                            gpu_fraction=1,
                            logging_level=logging.DEBUG)

# close model (eventually)
model.close()

Is that correct?

w4nderlust commented 5 years ago

I see your use case; it is interesting, as it can be used inside bash scripts. But it has a big downside, I believe: you load the model every single time you call it, which is really slow. An alternative could be to start a REST server (something we already planned to add), and then you can just curl that endpoint in your bash line. It has the overhead of calling through REST, but that's nothing compared to the overhead of loading the model every single time. What's your take on it?
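For illustration, calling such an endpoint from a script (or equivalently with curl) might look like the sketch below. The URL, port, and payload shape are assumptions; the server described here does not exist yet.

import requests

# hypothetical endpoint exposed by a future `ludwig serve`-style REST server
payload = {'text': ['call me at +3912345679 for offers']}
response = requests.post('http://localhost:8000/predict', json=payload)
print(response.json())  # the predictions for the submitted sample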

loretoparisi commented 5 years ago

@w4nderlust thanks! Yes, that is true, the downside of loading the model must be addressed. This is what I'm doing in fasttext.js, where I wrap the fastText binary in a node.js library: the model is loaded once through ludwig predict --model_path /path/to/model -, because predict (with that option) would call LudwigModel.load, open stdin and wait for a line feed. Then I send the text to the forked process by writing to its stdin (something like stdin.write('call me at +3912345679 for offers\r\n')). In this way ludwig predict could be used in several ways: from REST servers with a process fork, attached to a bash process, etc. In my experience with fastText, the execution overhead is minimal (a process fork and a pipe to stdin) and it's faster than running programmatically, without losing functionality. Since ludwig has a great command line, it would be a useful add-on.
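A rough sketch of that wrapping pattern in Python (rather than node.js), assuming a hypothetical stdin mode of ludwig predict that keeps the model loaded and answers one line of output per line of input; that mode is exactly the proposal here, not an existing flag:

import subprocess

# hypothetical: assumes `ludwig predict ... -` loads the model once and then
# answers line-by-line over stdin/stdout, as proposed above
proc = subprocess.Popen(
    ['ludwig', 'predict', '--only_prediction', '--model_path', '/path/to/model', '-'],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

# the model is loaded once; every write after that is just a pipe round trip
proc.stdin.write('call me at +3912345679 for offers\n')
proc.stdin.flush()
print(proc.stdout.readline())  # one prediction per input line

proc.stdin.close()
proc.wait()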

w4nderlust commented 5 years ago

Got it, that seems like a great idea. What I was planning was a ludwig serve command that would start a REST server; maybe I can either create another command or add an option for choosing whether to start the server or listen to stdin. The reason for not using predict for this purpose is that it would be nice to also expose model.train(), model.train_online() and basically any other function of the API. The only difficulty I can imagine with reading from stdin is that the input needs to be in some kind of tabular encoding, because the methods need to know which feature each value belongs to, so for instance your example would become stdin.write(text: call me at +3912345679 for offers, other_feature: other_value\r\n). Plus, more than one sample may be passed at the same time, which means you probably want to encode it in something like JSON: stdin.write({"text": ["call me at +3912345679 for offers", "text_2"], "other_feature": [other_value_1, other_value_2]}). I think that would work, what do you think? Maybe JSON will make escaping characters a bit of a mess...
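For illustration, a minimal sketch of what such a stdin-listening mode might look like internally, assuming the LudwigModel.load / predict(data_dict=...) API shown earlier and one JSON object per line; the loop and the output format are just assumptions:

import json
import logging
import sys

from ludwig.api import LudwigModel

model = LudwigModel.load('/path/to/model')  # loaded once, before the loop

for line in sys.stdin:  # one JSON object per line, e.g. {"text": ["...", "..."]}
    data_dict = json.loads(line)
    predictions = model.predict(data_dict=data_dict,
                                return_type=dict,
                                logging_level=logging.ERROR)
    # default=str is a shortcut; real code would convert numpy values to plain lists
    print(json.dumps(predictions, default=str))
    sys.stdout.flush()

model.close()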

loretoparisi commented 5 years ago

@w4nderlust the JSON approach for input / output would be awesome! Most of the problems we have with pre-processing in multiple languages (so double-byte characters as well) are due to Unicode handling, etc., so a JSON interface would solve this definitively. Maybe the ludwig command line could have a specific option for this stdin mode, like a --data_json analogous to --data_csv, and then a specific predict for the inline output, so as to avoid exposing any model-internal api that way.
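As a quick illustration of why JSON sidesteps the escaping and Unicode issues, a round trip with the standard library preserves multi-byte text exactly (nothing Ludwig-specific here):

import json

# multi-byte text survives a JSON round trip unchanged; ensure_ascii=False
# keeps the characters readable instead of \u-escaping them
sample = {'text': ['請打電話給我 +3912345679', 'call me at +3912345679 for offers']}
encoded = json.dumps(sample, ensure_ascii=False)
assert json.loads(encoded) == sample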

w4nderlust commented 5 years ago

I added it to the list of enhancements! Stay tuned.

Nasnl commented 5 years ago

I've been using Ludwig to classify tweets into several categories. So far I have only done this by putting them in CSV files, training, and then doing the same for predictions, but of course I would like to predict the class of each tweet as it becomes available. The JSON approach would be awesome, as it would allow me to add some more metadata and process everything in (near) real time without hassling around with CSV files.

cesarochoa2006 commented 5 years ago

After some thinking, I'm planning to implement a Ludwig environment with an Angular-based GUI and a REST project in Python's Django (REST services in Django are enterprise-ready and easy peasy), to do something a little more useful (think, for example, of a Docker container ready to execute). Before I start coding, I want to know your opinions and discuss it. Do you think it would be good or bad, or have you thought of something better?

w4nderlust commented 5 years ago

@cesarochoa2006 I was planning to implement it in Flask for simplicity, because I have used it in the past, but I'm not an expert in these web dev frameworks; maybe you can give me some reasons why one may want to use one or the other. I guess at the moment the main limitation comes from blob features like images (and, in the near future, audio), because ludwig expects a path to a file to load as input, and if you are writing a server you may get an image as a bytestream directly, I imagine. I don't have a clean and nice solution for that at the moment; the workaround would be to save to the local disk, obtain a path, pass it to ludwig and delete the file later, which doesn't seem clean at all, but may work until I find a better solution. Anyway, if you are interested in contributing the server that you are writing (it would be great and much appreciated!), we can chat more and figure out something together.
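A rough Flask sketch of that temporary-file workaround, assuming the predict(data_dict=...) API shown earlier and a hypothetical image input feature named image_path; the route name and cleanup strategy are just one way to do it:

import os
import tempfile

from flask import Flask, jsonify, request
from ludwig.api import LudwigModel

app = Flask(__name__)
model = LudwigModel.load('/path/to/model')  # loaded once at server startup


@app.route('/predict', methods=['POST'])
def predict():
    # the uploaded bytestream is written to a temporary file so ludwig gets a path
    upload = request.files['image']
    fd, tmp_path = tempfile.mkstemp(suffix='.png')
    try:
        with os.fdopen(fd, 'wb') as tmp:
            upload.save(tmp)
        predictions = model.predict(data_dict={'image_path': [tmp_path]},
                                    return_type=dict)
        # crudely stringified; real code would convert numpy values to lists
        return jsonify({k: str(v) for k, v in predictions.items()})
    finally:
        os.remove(tmp_path)  # the temporary file is deleted afterwards


if __name__ == '__main__':
    app.run()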

cesarochoa2006 commented 5 years ago

@w4nderlust For now it's only an idea I have, haha. I haven't started yet, but I will soon. Anyway, in the meantime I have no issues with Flask or Django (for the sake of simplicity Flask is more widely used for very simple projects, and both Flask and Django have pros and cons; I think it is more a personal decision to use one or the other). The only strong argument I have for going with Django is how easy it makes handling files, and the fact that I have handled some AI projects with it.

loretoparisi commented 5 years ago

Just adding my 2 cents here. For our cluster installation of Python servers we run Tornado. It lets you leverage multi-threaded applications, or Gunicorn for the WSGI approach. I actually prefer Tornado because I typically use a singleton mixin pattern, so that all the models (like TF / PyTorch graphs) are kept in memory once and shared across threads and all instances at every api call. Of course, from the framework perspective, there needs to be some middleware component to fill the gap between the framework and the front-end at runtime in order to serve byte streams. About this, one approach, provided by Stanford CoreNLP, was to consider every task as a whole pipeline, so that all resources are statically loaded at startup of the CoreNLP modules: all NLP annotators load their resources (from datasets to models to configurations) once, and if you add new annotators later on at run time, they share a singleton instance (actually a static class) that handles everything.
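In the same spirit (load once, share across requests), a bare-bones Tornado sketch; the handler, route, and port are illustrative, and the module-level model stands in for the singleton mixin described above, again assuming the LudwigModel API shown earlier:

import json

import tornado.ioloop
import tornado.web

from ludwig.api import LudwigModel

# loaded once at startup and shared by every handler instance and request
MODEL = LudwigModel.load('/path/to/model')


class PredictHandler(tornado.web.RequestHandler):
    def post(self):
        data_dict = json.loads(self.request.body)  # e.g. {"text": ["..."]}
        predictions = MODEL.predict(data_dict=data_dict, return_type=dict)
        # crudely stringified; real code would convert numpy values to lists
        self.write({k: str(v) for k, v in predictions.items()})


if __name__ == '__main__':
    app = tornado.web.Application([(r'/predict', PredictHandler)])
    app.listen(8000)
    tornado.ioloop.IOLoop.current().start()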