vmware-archive / kubeless

Kubernetes Native Serverless Framework
https://kubeless.io
Apache License 2.0
6.86k stars 755 forks

serverless function for machine learning modules (nltk library) #861

Open ps2420 opened 6 years ago

ps2420 commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

What happened:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

ps2420 commented 6 years ago

requirements.txt

Source code of the Python module that was used:

    import sys
    import numpy as np
    import random
    import unidecode
    import html
    import pathos.multiprocessing as mp
    import nltk
    from nltk.tokenize.treebank import TreebankWordTokenizer, TreebankWordDetokenizer
    from nltk.tokenize.moses import MosesTokenizer, MosesDetokenizer
    import os
    os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
    import tensorflow as tf

    from keras import layers
    from keras.models import Model
    from keras import optimizers
    from keras.callbacks import ModelCheckpoint

    def hello(event, context):
        return hello2(event, context)

    def hello2(event, context):
        subject = MyClass()
        return subject.methodA(event['data'])

    def hasnum(w):
        for c_i in w:
            if c_i.isdigit():
                return True
        return False

    def binarize(w, alph, augment=False):
        bin_initial = [0]*len(alph)
        bin_middle = [0]*len(alph)
        bin_end = [0]*len(alph)

        if w != "UNK":
            if augment and len(w)>3:
                w_mid = ''.join(random.sample(w[1:-1], len(w[1:-1])))
                w = w[0] + w_mid + w[-1]

            for i in range(len(w)):
                try:
                    if i==0:
                        bin_initial[alph.index(w[i])] += 1
                    elif i==len(w)-1:
                        bin_end[alph.index(w[i])] += 1
                    else:
                        bin_middle[alph.index(w[i])] += 1
                except ValueError:
                    return np.array([0]*len(alph)*3), w

        bin_all = bin_initial + bin_middle + bin_end
        return np.array(bin_all), w

    class MyClass:

        def __init__(self):
            self.contents = dict()

        def get(self, key):
            return self.contents.get(key, None)

        def methodA(self, x):
            return ('methodA was called within myClass with argument ==> ' + str(x))

andresmgot commented 6 years ago

hi @ps2420,

It seems the problem is that the library is trying to write files to a privileged path, /nltk/data. By default, Kubeless functions run as an unprivileged user for security reasons. You can disable this by changing the Kubeless configuration to remove the default securityContext value and then restarting the controller:

$ kubectl patch configmap -n kubeless kubeless-config \
 -p '{"data":{"deployment":"{\"spec\":{\"template\":{\"spec\":{\"securityContext\":{}}}}}"}}'
$ kubectl delete pod -n kubeless -l kubeless=controller

After that your functions will be running as root.
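
The patch argument above is a JSON document whose "deployment" value is itself a JSON string, so the escaping is easy to get wrong by hand. A minimal sketch of generating it in Python instead; the variable names are only illustrative, and it assumes you then pass the printed string to `kubectl patch ... -p`:

```python
import json

# Deployment override with an empty securityContext, as in the command above.
deployment = {"spec": {"template": {"spec": {"securityContext": {}}}}}

# The ConfigMap stores the deployment template as a string, so it is
# JSON-encoded twice: once for the value, once for the patch document.
compact = {"separators": (",", ":")}
patch = json.dumps({"data": {"deployment": json.dumps(deployment, **compact)}}, **compact)

print(patch)
```

Because the whole "deployment" value is a single string, a patch like this replaces that string as a unit. If your kubeless-config already carries other deployment settings, they would need to be included in the `deployment` dictionary as well.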

alexander-alvarez commented 6 years ago

Another option that worked for me on a related task, if you can afford it and have the access to do so, is to write intermediate files under /tmp/.
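
For NLTK specifically, a rough sketch of that approach could look like the following. It assumes the system temp directory (usually /tmp) is writable inside the function container; the helper `writable_data_dir` and the directory name `nltk_data` are hypothetical choices, not something from this thread. NLTK reads the `NLTK_DATA` environment variable when it builds its search path, so the variable is set before the import:

```python
import os
import tempfile

def writable_data_dir(name="nltk_data"):
    """Create (if needed) and return a data directory under the system temp dir."""
    path = os.path.join(tempfile.gettempdir(), name)
    os.makedirs(path, exist_ok=True)
    return path

DATA_DIR = writable_data_dir()

# Set before importing nltk so the directory ends up on its search path.
os.environ["NLTK_DATA"] = DATA_DIR

try:
    import nltk
    if DATA_DIR not in nltk.data.path:
        nltk.data.path.append(DATA_DIR)
    # To fetch a resource into the writable directory, something like:
    # nltk.download("punkt", download_dir=DATA_DIR)  # "punkt" is only an example
except ImportError:
    # nltk is not installed in this environment; the directory setup still applies.
    nltk = None
```

This keeps the function running as the default unprivileged user. Anything written under the temp directory is lost when the pod restarts, so downloads would be repeated on cold start.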