indiejoseph opened 7 years ago
Noted. Requirements draft:

- Google Cloud Storage to store the uploaded images
- Serving the trained model via Google Cloud ML
- Flask as front-end webserver and API

Any missing use case?
@jccf091 OK, can you try to build a simple server to serve a Keras ImageNet model first? Just an API is fine.
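Something like this minimal sketch would do, assuming the Keras-bundled ResNet50 ImageNet weights (the route and the `photo` field name here are just illustrative, not decided yet):

```python
import numpy as np
from flask import Flask, jsonify, request
from keras.applications.resnet50 import ResNet50, decode_predictions, preprocess_input
from PIL import Image

app = Flask(__name__)
model = ResNet50(weights='imagenet')  # load once at startup, not per request

@app.route('/predict', methods=['POST'])
def predict():
    # Read the uploaded file and shape it the way ResNet50 expects.
    img = Image.open(request.files['photo'].stream).convert('RGB').resize((224, 224))
    x = preprocess_input(np.expand_dims(np.asarray(img, dtype='float32'), axis=0))
    top5 = decode_predictions(model.predict(x), top=5)[0]
    return jsonify(predictions=[{'name': n, 'probability': float(p)} for _, n, p in top5])

if __name__ == '__main__':
    app.run(port=5000)
```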
I am building a Laravel server that allows public users to submit images and connects them to a worker machine (it is designed to support user sessions, but that part is not ready yet). The public API and the API between the ml-server and the demo page are already done, but I have not yet written the ml-server. See https://flower.jackhftang.com; I will upload the source code later.
@jackhftang The Laravel server can be the API gateway, but I don't see how it would work with the ml-server. Can you give me more detail? And how about the infrastructure? Thanks
@indiejoseph In short, each image has the fields `user_id`, `image_id`, `job_id`, `image_url`, `status`, `model`, and `result` (plus the usual `created_at` and `updated_at`). Users can only see `image_id`, `image_url`, `status`, `model`, and `result`. As for ml-servers, they can only see `job_id` and `image_url`.
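To make the visibility rules concrete, a rough sketch (the projection helper is hypothetical):

```python
# Sketch of the image record and the per-audience views described above.
IMAGE_FIELDS = ['user_id', 'image_id', 'job_id', 'image_url',
                'status', 'model', 'result', 'created_at', 'updated_at']

USER_VISIBLE = ['image_id', 'image_url', 'status', 'model', 'result']
ML_SERVER_VISIBLE = ['job_id', 'image_url']

def view_for(record, fields):
    """Project a full image record down to the fields an audience may see."""
    return {k: record[k] for k in fields}
```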
ml-servers need to pre-register with the api-server with a unique `name` and an `endpoint`.
Once a user uploads an image (together with an optional `model` of choice), the image is stored on the api-server, and the api-server sends a request to an ml-server according to `model`. If no `model` is specified, it randomly chooses one.
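In pseudocode, registration and dispatch look roughly like this (the in-memory registry and the ml-server route are assumptions, not the final API):

```python
import random
import requests

ML_SERVERS = {}  # name -> endpoint, filled by pre-registration

def register(name, endpoint):
    ML_SERVERS[name] = endpoint

def dispatch(job_id, image_url, model=None):
    # Pick the requested model's server, or a random one if none was specified.
    name = model if model in ML_SERVERS else random.choice(list(ML_SERVERS))
    # Send only what ml-servers are allowed to see: job_id and image_url.
    requests.post(ML_SERVERS[name], json={'job_id': job_id, 'image_url': image_url})
```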
ml-servers listen on their endpoint and will get a message containing `job_id` and `image_url`. An ml-server can choose to accept or reject the job. When the job is finished, it replies to the api-server with the `job_id` and the `result`. If the ml-server rejects the job, the api-server may choose another model, or retry this ml-server later according to `model`.
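The ml-server side of the protocol, sketched (the route name and base URL are assumptions, and the placeholder stands in for the real model call):

```python
import threading
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
API_SERVER = 'https://flower.jackhftang.com/api/v0'  # assumed base URL
BUSY = False  # placeholder load check; a real server would track its queue

def classify_and_reply(job_id, image_url):
    result = []  # placeholder: download the image and run the model here
    # Post the result back; the api-server then marks the image 'done'.
    requests.post(API_SERVER + '/result', json={'job_id': job_id, 'result': result})

@app.route('/job', methods=['POST'])
def job():
    msg = request.get_json()  # the message carries only job_id and image_url
    if BUSY:
        return jsonify(status='reject')  # api-server may retry or pick another model
    threading.Thread(target=classify_and_reply,
                     args=(msg['job_id'], msg['image_url'])).start()
    return jsonify(status='accept')  # api-server marks the image 'processing'
```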
An image has three statuses, namely pending, processing, and done. Initially, an image is in pending status. It changes to processing once an ml-server replies 'accept', and to done after the ml-server posts back a result.
Auth can be built on top; it is even easier to implement than to integrate with third-party services. Currently, `job_id` and `image_id` are 60-char alphanumeric strings and `image_url` is a 41-char alphanumeric string. These three fields are independent (in terms of probability), and no one except the api-server knows all three fields for any image at once. IMO, the privacy of the current setting is strong. Currently, all images belong to a virtual public user.
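For reference, independent tokens of those lengths can be generated like this (a sketch; the real generator may differ):

```python
import secrets
import string

ALPHANUM = string.ascii_letters + string.digits

def token(n):
    """Cryptographically random alphanumeric string of length n."""
    return ''.join(secrets.choice(ALPHANUM) for _ in range(n))

# The three fields are drawn independently, so knowing one reveals
# nothing about the others.
job_id, image_id, image_url_token = token(60), token(60), token(41)
```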
One other minor thing: all uploaded images are currently resized to a width of 300px.
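The resize is the usual aspect-preserving scale, e.g. with Pillow (a sketch; the function name is mine):

```python
from PIL import Image

def resize_to_width(src, dst, width=300):
    """Scale an uploaded image to 300px wide, keeping the aspect ratio."""
    img = Image.open(src)
    height = round(img.height * width / img.width)
    img.resize((width, height), Image.LANCZOS).save(dst)
```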
Detailed message formats and routes will be available after the prototype is done. It should be tonight or tomorrow night.
btw, for production use, I am looking for sponsors for a domain and a server, and I am happy to make a transfer. I am fine if the app eventually does not use this api-server.
@jccf091 what do you think?
@jackhftang @jccf091 Let me draw a diagram to describe the whole picture, from app to backend and authentication. Stay tuned.
Forgot to say: the user holds the `image_id` and polls for the classification result.
@indiejoseph @jackhftang What's described is too complex.
I don't think we need a job-queue pattern to handle prediction for user-submitted flower images.
To me, the easiest approach is to have a single API server handle the incoming requests. The user should be able to upload the flower image and get the prediction within one or two requests. I don't expect running the trained model against an uploaded image to take longer than 10 seconds.
Personally, I don't like long polling. It introduces a lot of issues, and it should only be used for real-time features.
The reason I suggest using Flask is to reduce overall complexity, since we are going to write Python anyway.
If we do need a job-queue pattern and need to push results to the front end, I would suggest using the Phoenix Framework.
@indiejoseph @jackhftang Why do we need to expose "model" knowledge to the front end? One trained model handling all image classification tasks sounds good to me. We should serve different model versions on Google Cloud or somewhere, so that we can roll back to a different version easily.
The reasons to separate out an ml-server:

1. It is very likely there will be more than one model. Models will evolve, and you will probably want to try new ideas from time to time. If you consider not just the app but also the development process, I guess you will want the ability to select a model, submit an image, and see the result afterward, or compare results across models.
2. Hosting the api-server is easy, but the ml-server could be memory-hungry, or worse, the model may only run on a GPU, and you may not want to pay for a long-living GPU instance to serve few requests. The architecture allows the ml-server to be offline, or to reply much later.
btw, it is not long polling.
> Forgot to say: the user holds the `image_id` and polls for the classification result.
@jackhftang not long polling?
@jackhftang We can also choose to host the model on Google Cloud ML.
@jccf091
It's not long polling; the current design is to GET /api/v0/result/
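i.e. plain polling. Roughly, on the client side (a sketch; I'm assuming the `image_id` goes on the end of that route):

```python
import time
import requests

def wait_for_result(base, image_id, interval=1.0, timeout=30.0):
    """Poll GET /api/v0/result/<image_id> until the image status is 'done'."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.get('%s/api/v0/result/%s' % (base, image_id)).json()
        if r.get('status') == 'done':
            return r['result']
        time.sleep(interval)
    raise TimeoutError('classification still pending')
```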
You may have a look at the mock ml-server here; I just have not yet integrated it with the trained ResNet50 model. The web front end is already able to display results. https://gist.github.com/jackhftang/2bb2bde6f601362a970c73cc7072f3ec
And I have no experience with Google Cloud ML.
Exploring ML Engine on Google Cloud https://medium.com/google-cloud/keras-inception-v3-on-google-compute-engine-a54918b0058
I have implemented the minimum features: upload a file and classify it. Currently, images are resized and stored on the api-server. Information about which images a user owns is stored in localStorage. An admin panel is provided by Laravel + Voyager, which gives a basic media/file explorer and a graphical database viewer/editor. The api-server is behind Cloudflare, which provides a CDN for all images. Real-time updates are currently done by polling, and each poll is just one simple MySQL key lookup. I expect a commodity computer can handle 10k queries/second.
As for the ml-server, it is currently hosted on a somewhat decent machine. It takes around 10 seconds to load the libraries and the model (using CPU as backend) before it is ready to serve. It takes around 1.5 GB of memory and spawns 114 threads in total. I read some articles about Google ML Engine saying its real-time API can respond within a second...
Anyway, let's use Google Cloud. I still have a $300 USD coupon that is not yet used and will expire =] I'll keep this project as part of my portfolio.
Some request/response data formats for reference.

Request:

```json
{
  "lat": "float",
  "long": "float",
  "photo": "blob"
}
```

Response:

```json
{
  "predictions": [
    {
      "name": "string // flower name",
      "probability": "float"
    },
    ...
  ]
}
```
Nice
https://cloud.google.com/about/locations/ It seems like we can only deploy to TW.
Infrastructure for serving the trained model with a RESTful API:

1. API gateway for the app to upload an image
2. Pass the image through the trained model to get a prediction
3. Infrastructure for serving the trained model: AWS or Google Cloud?