openaustralia / yinyo

A wonderfully simple API-driven service to reliably execute many long-running scrapers in a super scalable way
https://yinyo.io
Apache License 2.0

Test scraper seems to fail silently. #62

Closed jamezpolley closed 4 years ago

jamezpolley commented 4 years ago

I'm following the guide in https://github.com/openaustralia/morph-ng/blob/master/README.md

I've created a minikube cluster, installed kubedb, made buckets, and now I'm running my first scraper.

The scraper job seems to be failing and I'm not sure why.

Logs from the pod:

runName test-scrapers-test-python-5llhm
runToken QnH8HBemWDZTZISfKtGONUV7WG3ScSFx
runOutput data.sqlite
serverURL http://clay-server.clay-system:8080
buildCommand /bin/herokuish buildpack build
runCommand /bin/herokuish procfile start scraper
root.go:164: EOF

Looking through the dashboard, it seems as though the clay-server service might not be coming up completely.

[james@bully:~/src/oaf/morph-ng/morph-ng] master* ± kubectl get --namespace=clay-system service
NAME            TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
clay-server     LoadBalancer   10.109.201.168   <pending>     8080:32345/TCP   11m
kubedb          ClusterIP      None             <none>        <none>           11m
minio-service   LoadBalancer   10.108.210.84    <pending>     9000:31016/TCP   11m
redis           ClusterIP      10.105.233.48    <none>        6379/TCP         11m
[james@bully:~/src/oaf/morph-ng/morph-ng] master* ± kubectl describe --namespace=clay-system service clay-server
Name:                     clay-server
Namespace:                clay-system
Labels:                   app.kubernetes.io/managed-by=skaffold-v1.0.0
                          skaffold.dev/builder=local
                          skaffold.dev/cleanup=true
                          skaffold.dev/deployer=kustomize
                          skaffold.dev/docker-api-version=1.39
                          skaffold.dev/run-id=84011c9e-1c96-475c-b2a4-ce9dc2e37844
                          skaffold.dev/tag-policy=envTemplateTagger
                          skaffold.dev/tail=true
Annotations:              kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app.kubernetes.io/managed-by":"skaffold-v1.0.0","skaffold.dev/...
Selector:                 app=clay-server
Type:                     LoadBalancer
IP:                       10.109.201.168
Port:                     <unset>  8080/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  32345/TCP
Endpoints:                172.17.0.5:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

mlandauer commented 4 years ago

I think this is a bug that is marked as a "TODO" in the code:

// TODO: Don't fail if the cache doesn't yet exist

I'll work on fixing this right now.

mlandauer commented 4 years ago

I can reproduce the bug by removing the cache file at assets/client-storage/cache/test/scrapers/test-python.tgz and running make.

mlandauer commented 4 years ago

Actually, that TODO in the code was out of date. The actual problem is in the client.sh code, which uploads an empty cache file, and that confuses things. I've also added #63 to improve the developer experience by returning an error when you try to upload an incorrectly formatted file.
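
For illustration, the kind of check that #63 describes could look roughly like the sketch below. It is written in Python purely as an example and is not the code from client.sh or the server. The function name `is_valid_cache_archive` is hypothetical, and the sketch assumes the cache is a gzip-compressed tar, as the `.tgz` extension suggests.

```python
import os
import tarfile
import zlib


def is_valid_cache_archive(path):
    """Return True only if path is a non-zero-length, readable gzip'd tar.

    Hypothetical helper for illustration; not part of the project.
    """
    # A zero-byte upload is the failure mode described in this issue.
    if os.path.getsize(path) == 0:
        return False
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            # Walk every member header so truncated archives are caught too.
            for _ in archive:
                pass
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        # Not gzip, not tar, or cut off part-way through.
        return False
    return True
```

A caller could run a check like this before uploading and report a clear error instead of sending a file that later surfaces only as a bare `EOF`.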

mlandauer commented 4 years ago

@jamezpolley I'll just reopen this issue and assign it to you. Could you please verify that a48263f662fab2a7fd19a089465329309234cbfb actually does fix things for you? If it does, please close this issue. Thanks!

jamezpolley commented 4 years ago

Yes, definitely not seeing this now.