ThomasDelteil / VisualSearch_MXNet

Visual Search using Apache MXNet and gluon

Issue creating model archive #11

Closed klvnmarshall closed 5 years ago

klvnmarshall commented 5 years ago

Does the --handler service in this case have an entry point function?

ThomasDelteil commented 5 years ago

Yes. The MXNet Model Server has been upgraded quite a bit since I did this project, so I would recommend referring to the MXNet Model Server documentation. @OElesin, did you create an updated version for the latest version of MMS? Would you like to contribute it?

klvnmarshall commented 5 years ago

I think we need to add a handle() entry point function to our visualservice.py:

_service = VisualSearchService()

def handle(data, context):
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    return _service.handle(data, context)

And then I created the archive using this command:

model-archiver --model-name visualsearch --model-path visualsearch --handler visualservice:handle

But when I tried this, I encountered an error:

2019-04-16 12:55:34,662 [INFO ] W-9001-similar-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - _service = VisualSearchService()
2019-04-16 12:55:34,662 [INFO ] W-9001-similar-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - TypeError: __init__() missing 3 required positional arguments: 'model_name', 'model_dir', and 'manifest'
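
For context, a TypeError like this is what Python raises when a class whose constructor requires arguments is instantiated without them. The sketch below reproduces the message with a hypothetical stand-in base class. LegacyService is an assumption made for illustration, not the actual MMS base class, and the real cause in visualservice.py may differ in detail.

```python
class LegacyService:
    """Hypothetical stand-in for an older-style base service (not the real MMS class)."""

    def __init__(self, model_name, model_dir, manifest):
        self.model_name = model_name
        self.model_dir = model_dir
        self.manifest = manifest


class VisualSearchService(LegacyService):
    """Inherits the three-argument constructor unchanged."""


def try_create():
    """Return the error message raised by a no-argument construction, or None."""
    try:
        VisualSearchService()
    except TypeError as err:
        return str(err)
    return None


message = try_create()
print(message)
```

If this is what is happening, a module-level VisualSearchService() call can only work once the class has a constructor that takes no arguments.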

MaxTran96 commented 5 years ago

Hello, this is the code I wrote for the service file with Thomas's help. You can keep _postprocess() the same. For the initialization, this is how you make an entry point:

import logging
import os
import pickle

import hnswlib
import mxnet as mx

# INDEX_URL, IDX_ASIN_URL, ASIN_DATA_URL, EMBEDDING_SIZE, EF and K are the
# constants defined elsewhere in the service file.

class VisualSearchService():

    def __init__(self):
        # No arguments, so the class can be instantiated at module level
        self.initialized = False

    def initialize(self):
        logging.info('Downloading Resources Files')

        data_dir = os.environ.get('DATA_DIR', 'mms/')
        if not os.path.isdir(data_dir):
            os.makedirs(data_dir)
        index_url = os.environ.get('INDEX_URL', INDEX_URL)
        idx_to_ASIN_url = os.environ.get('IDX_ASIN_URL', IDX_ASIN_URL)
        ASIN_to_data_url = os.environ.get('ASIN_DATA_URL', ASIN_DATA_URL)

        mx.test_utils.download(index_url, dirname=data_dir)
        mx.test_utils.download(idx_to_ASIN_url, dirname=data_dir)
        mx.test_utils.download(ASIN_to_data_url, dirname=data_dir)

        logging.info('Loading Resources files')

        with open(os.path.join(data_dir, 'idx_ASIN.pkl'), 'rb') as f:
            self.idx_ASIN = pickle.load(f)
        with open(os.path.join(data_dir, 'ASIN_data.pkl'), 'rb') as f:
            self.ASIN_data = pickle.load(f)
        self.p = hnswlib.Index(space='l2', dim=EMBEDDING_SIZE)
        self.p.load_index(os.path.join(data_dir, 'index.idx'))

        logging.info('Resources files loaded')

        self.p.set_ef(EF)
        self.k = K
        self.initialized = True

For _preprocess(), it depends on how you feed the image data to the model. You can write it like this using base64 encoding:

def _preprocess(self, data):

    image_bytes = base64.b64decode(data)
    image_PIL = Image.open(io.BytesIO(image_bytes))
    image_np = np.array(image_PIL)
    image_t = transform(nd.array(image_np[:, :, :3]))
    image_batchified = image_t.expand_dims(axis=0).as_in_context(ctx)
    return [image_batchified]

Then at the bottom, you can write handle() in this format:

svc = VisualSearchService()

def handle(data, context):
    res = ""
    if not svc.initialized:
        svc.initialize()
    if data is not None:
        x = svc._preprocess(data)
        res = svc._postprocess(x)
    return res

# Quick local test with a sample image
svc.initialize()
path = '1.jpg'  # you can change this
with open(path, "rb") as image_file:
    y = base64.b64encode(image_file.read())
x = svc._preprocess(y)
ans = svc._postprocess(x)

Hope this helps :D
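
The lazy-initialization pattern used by handle() can be tried without MXNet or the index files. The following is a minimal sketch under that assumption: StubService and its methods are hypothetical placeholders for VisualSearchService, and only the control flow mirrors the handler described here.

```python
class StubService:
    """Hypothetical stand-in for VisualSearchService with no external dependencies."""

    def __init__(self):
        self.initialized = False
        self.init_calls = 0

    def initialize(self):
        # The real service would download and load its resources here.
        self.init_calls += 1
        self.initialized = True

    def _preprocess(self, data):
        # Wrap the single input into a batch of one.
        return [data]

    def _postprocess(self, batch):
        return ["result for {}".format(item) for item in batch]


# One service instance per worker process, created when the module is imported.
svc = StubService()


def handle(data, context):
    """Module-level entry point: initialize on first use, then serve each request."""
    if not svc.initialized:
        svc.initialize()
    if data is None:
        return ""
    return svc._postprocess(svc._preprocess(data))


first = handle("img-1", None)
second = handle("img-2", None)
print(first, second, svc.init_calls)
```

Because the instance lives at module level, the expensive initialize() step runs once per worker and later requests reuse the loaded state.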

klvnmarshall commented 5 years ago

Thanks @MaxTran96 for the answer. Let me have a look at it and then I'll get back to you.

klvnmarshall commented 5 years ago

So far so good. My model archive is loaded, thanks to @MaxTran96. But I am having trouble sending curl commands from the terminal using a base64-encoded image sent as FormData in the data field, as referenced in #9 by @ThomasDelteil.

MaxTran96 commented 5 years ago

If you're using curl, then you shouldn't be using base64 encoding, since it will accept raw bytes. Change _preprocess() to accept the raw byte format:

def _preprocess(self, data):

    image_bytes = data[0]['body']
    image_PIL = Image.open(io.BytesIO(image_bytes))
    image_np = np.array(image_PIL)
    image_t = transform(nd.array(image_np[:, :, :3]))
    image_batchified = image_t.expand_dims(axis=0).as_in_context(ctx)
    return [image_batchified]
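
The difference between the two input conventions can be sketched with the standard library alone. This assumes the request arrives as a list of dicts with a 'body' key, as in data[0]['body'] above. The extract_image_bytes helper and its is_base64 flag are hypothetical names introduced for illustration, and the payload is a placeholder rather than a real image.

```python
import base64


def extract_image_bytes(data, is_base64=False):
    """Return the image bytes from a batch shaped like [{'body': ...}]."""
    body = data[0]['body']
    if is_base64:
        # The client encoded the file before sending it, so decode first.
        return base64.b64decode(body)
    # The client sent the file as-is; normalize bytearray to bytes.
    return bytes(body)


raw = b"\x89PNG\r\n\x1a\n placeholder payload"  # not a real image
raw_request = [{'body': raw}]
b64_request = [{'body': base64.b64encode(raw)}]

from_raw = extract_image_bytes(raw_request)
from_b64 = extract_image_bytes(b64_request, is_base64=True)
print(from_raw == from_b64)
```

Either path should end with the same bytes, which can then be passed to Image.open(io.BytesIO(...)) as in _preprocess().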

Let me know how it goes :D

klvnmarshall commented 5 years ago

Thanks @MaxTran96, I got everything working.

But I'm getting inconsistent results that are quite different from the results in my notebook. What is going on? Are the input images not being passed through the network parameters? Please help me figure it out.

klvnmarshall commented 5 years ago

From this code I found that there is no _inference function call, and that the service executes _preprocess and then directly executes _postprocess:

svc = VisualSearchService()

def handle(data, context):
    res = ""
    if not svc.initialized:
        svc.initialize()
    if data is not None:
        x = svc._preprocess(data)
        res = svc._postprocess(x)
    return res

I used a lazy approach and added this line to my code

self.net=vision.resnet18_v2(pretrained=True, ctx=ctx).features

and in my _preprocess function I passed the transformed image array through the network:

image_batchified = self.net(image_t.expand_dims(axis=0).as_in_context(ctx))

I know there is a better approach using MXNetBaseService, but I found this to be the quickest for my needs.

Issue solved.
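
Why a missing inference step changes the search results can be illustrated with a toy pipeline. Everything in this sketch is a hypothetical stand-in: preprocess, inference and nearest are not MXNet or hnswlib calls, and the two-number vectors are made up. It only shows that querying an index of embeddings with un-embedded input returns a different neighbour.

```python
def preprocess(pixels):
    """Stand-in for the image transform: scale raw pixel values to [0, 1]."""
    return [p / 255.0 for p in pixels]


def inference(x):
    """Stand-in for the network: map the input to a 2-number 'embedding'."""
    return [x[0] + x[1], x[0] - x[1]]


def nearest(index, query):
    """Return the key whose vector has the smallest squared L2 distance to query."""
    def dist(vec):
        return sum((a - b) ** 2 for a, b in zip(vec, query))
    return min(index, key=lambda key: dist(index[key]))


# The index stores embeddings, i.e. outputs of inference().
index = {"item-a": [1.0, 1.0], "item-b": [1.0, 0.0]}


def search(pixels, run_inference):
    x = preprocess(pixels)
    if run_inference:
        x = inference(x)
    return nearest(index, x)


with_net = search([255, 0], run_inference=True)
without_net = search([255, 0], run_inference=False)
print(with_net, without_net)
```

With the inference step the query lands on the embedding it was indexed under. Without it, the preprocessed pixels are compared directly against embeddings and a different item wins.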