machinebox / issues

Machine Box issues, bugs and feature requests

textbox check endpoint not responding #21

Closed j0hnsmith closed 6 years ago

j0hnsmith commented 6 years ago

The /textbox/check endpoint isn't responding. Textbox itself is running, since I can load the web interface; here's a screenshot of the demo.

[Screenshot: textbox demo]

The log isn't showing anything unusual:

2018/03/05 10:31:32 Number of MB_WORKERS is set to 1
[INFO]     starting...

        Welcome to Textbox by Machine Box
        (textbox 5f794e7)

        Visit the console to see what this box can do:
        http://localhost:8080

        If you have any questions or feedback, get in touch:
        https://machinebox.io/contact

        Please consider buying a subscription:
        https://machinebox.io/#pricing

        Report bugs and issues:
        https://github.com/machinebox/issues

        Tell us what you build on Twitter @machineboxio

[INFO]     box ready

The connection is accepted but then hangs:

$ curl 'http://localhost:8080/textbox/check' -d 'text=some text' -v
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8080 (#0)
> POST /textbox/check HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Length: 14
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 14 out of 14 bytes

matryer commented 6 years ago

Interesting one... how much RAM and how many CPUs does your engine have?

matryer commented 6 years ago

Can you try stopping and restarting the box too?

j0hnsmith commented 6 years ago

It's running on GKE with no specific resource limits in the namespace (cpu: 100m in the global namespace). I tried adding some specific limits; when they are too low there's an error message command exited during start up.

It's working now, so the restart 'fixed' it. If I see it again I'll shout here.

matryer commented 6 years ago

Hmm.. strange one. Yes, please keep us informed.

j0hnsmith commented 6 years ago

I can reproduce it pretty consistently with this Kubernetes config. Perhaps it's a race condition that shows itself when the box is starved of CPU? When I change the CPU limit to 500m it doesn't seem to happen.

---
apiVersion: v1
kind: Service
metadata:
  name: textbox
  namespace: machinebox
  labels:
    name: textbox
spec:
  ports:
    - name: http
      port: 8080
  selector:
    app: textbox
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: textbox
  namespace: machinebox
spec:
  replicas: 1
  revisionHistoryLimit: 3
  template:
    metadata:
      labels:
        app: textbox
    spec:
      containers:
        - name: textbox
          image: machinebox/textbox:latest
          imagePullPolicy: Always
          ports:
          - containerPort: 8080
            name: http
          resources:
            limits:
              cpu: "200m"
              memory: 2Gi
            requests:
              cpu: "100m"
              memory: 512Mi
          env:
            - name: MB_KEY
              valueFrom:
                secretKeyRef:
                  name: machinebox
                  key: key
            - name: MB_TEXTBOX_DISABLE_SENTIMENT
              value: "true"

j0hnsmith commented 6 years ago

Hmm, ignore the above; it's happening with 500m too. It seems to be random: sometimes it starts and returns some responses, then locks up after a few minutes.
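Since the box accepts the connection and then never answers, a client-side timeout is one way to tell "hung" apart from "slow". The following is a minimal sketch, not part of Machine Box: the function name `check_responds` and the 5 second default are assumptions, and the only thing taken from this thread is the form-encoded `text=` POST to /textbox/check.

```python
import urllib.parse
import urllib.request


def check_responds(base_url, text="some text", timeout=5.0):
    """Return True if POST /textbox/check answers with HTTP 200 within `timeout` seconds.

    Hypothetical helper: mirrors the curl call in this thread
    (form-encoded `text=` field), nothing more.
    """
    data = urllib.parse.urlencode({"text": text}).encode()
    try:
        with urllib.request.urlopen(
            base_url + "/textbox/check", data=data, timeout=timeout
        ) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, HTTP error statuses (URLError/HTTPError)
        # and timeouts while waiting for the response.
        return False


if __name__ == "__main__":
    # Address taken from the startup banner above.
    print(check_responds("http://localhost:8080"))
```

Run periodically, a probe like this would report False once the box locks up, instead of blocking forever the way the curl call does.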

matryer commented 6 years ago

@dahernan Let's take a look at this later today.

dahernan commented 6 years ago

Hi, when textbox is struggling it's usually related to memory rather than CPU. Please try giving it more memory: probably around 3GB for just MB_WORKERS=1, and 1GB more per extra worker.

Are the requests very big? For example, are you processing a big document? In that case, if you don't have enough memory, try breaking it up into several smaller requests.
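As a rough illustration of that rule of thumb (about 3GB for one worker, plus 1GB per extra worker), here is a small sketch. The function name `recommended_memory_gb` and its parameters are made up for this example; the numbers are the approximate guidance from this comment, not a documented requirement.

```python
def recommended_memory_gb(workers=1, base_gb=3, per_extra_worker_gb=1):
    """Approximate memory for textbox, following the rule of thumb in this thread.

    Hypothetical helper: base_gb covers the first worker, and each
    additional worker adds per_extra_worker_gb.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return base_gb + (workers - 1) * per_extra_worker_gb


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"MB_WORKERS={n}: about {recommended_memory_gb(n)}GB")
```

By this estimate, the 2Gi limit in the config above falls short even for a single worker.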
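One possible way to break a big document into smaller requests is to pack whole words into chunks of bounded size and send each chunk to /textbox/check separately. This is only a sketch: `split_text` and the 2000 character default are assumptions for illustration, not a limit stated by Machine Box.

```python
def split_text(text, max_chars=2000):
    """Split text on whitespace into chunks of at most max_chars characters.

    Hypothetical helper. A single word longer than max_chars is kept
    whole, so that one chunk may exceed the limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks, current, length = [], [], 0
    for word in text.split():
        # Account for the joining space when the chunk already has words.
        extra = len(word) + (1 if current else 0)
        if current and length + extra > max_chars:
            chunks.append(" ".join(current))
            current, length = [], 0
            extra = len(word)
        current.append(word)
        length += extra
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    for chunk in split_text("one two three four five", max_chars=10):
        print(repr(chunk))
```

Splitting on whitespace is the simplest choice; splitting on sentence or paragraph boundaries would probably give better analysis results per chunk.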

j0hnsmith commented 6 years ago

Requests were tiny, e.g. example text.

I haven't had any problems for a few days since setting memory to 3GB. Maybe a warning at startup if memory is less than, say, 2GB would help; at least the problem would be less silent.
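A startup check along those lines could look roughly like the sketch below. It is not how textbox works; it assumes a Linux container where the memory limit is exposed through the standard cgroup files, and the function names and the warning text are invented for this example.

```python
from pathlib import Path

# Standard cgroup locations for the container memory limit.
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",                    # cgroup v2
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",  # cgroup v1
)


def parse_limit(raw):
    """Parse a cgroup limit value; return None for 'max' (no limit) or junk."""
    raw = raw.strip()
    if not raw.isdigit():
        return None
    return int(raw)


def read_memory_limit(paths=CGROUP_LIMIT_FILES):
    """Return the memory limit in bytes from the first readable file, or None."""
    for path in paths:
        try:
            return parse_limit(Path(path).read_text())
        except OSError:
            continue
    return None


def low_memory_warning(limit_bytes, minimum_gb=2):
    """Return a warning string if the limit is below minimum_gb, else None."""
    if limit_bytes is None:
        return None
    if limit_bytes < minimum_gb * 1024**3:
        return (
            f"[WARN] memory limit is {limit_bytes / 1024**3:.1f}GB, "
            f"below the suggested {minimum_gb}GB; requests may hang"
        )
    return None


if __name__ == "__main__":
    warning = low_memory_warning(read_memory_limit())
    if warning:
        print(warning)
```

With the 512Mi request from the config above applied as a limit, this would print a warning at startup rather than failing silently later.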

dahernan commented 6 years ago

Ohh cool, thanks!

I will have a look at memory consumption.