studioml / studio

Studio: Simplify and expedite model building process
https://studio.ml
Apache License 2.0
379 stars 52 forks

Not able to connect to the server #360

Closed Patechoc closed 5 years ago

Patechoc commented 5 years ago

Hi,

here is my ~/.studioml/config.yaml configuration file

database:
    type: http
    serverUrl: https://zoo.studio.ml:5000
    authentication: github
    guest: true

server:
    authentication: github

storage:
    type: gcloud
    bucket: studio-ed756.appspot.com

queue: local

saveMetricsFrequency: 1m
saveWorkspaceFrequency: 1m
verbose: error

cloud:
    gcloud:
        zone: us-east1-c

resources_needed:
    cpus: 2
    ram:  3g
    hdd:  60g
    gpus: 0

sleep_time: 1
worker_timeout: 30

optimizer:
    cmaes_config:
        popsize: 100
        sigma0: 0.25
        load_best_only: false
    load_checkpoint_file:
    visualization: true
    result_dir: "~/Desktop/"
    checkpoint_interval: 0
    termination_criterion:
        generation: 5
        fitness: 999
        skip_gen_thres: 1.0
        skip_gen_timeout: 30

I do want to authenticate, but it seems not to be working for me either. Here is the error I get:

$ studio ui
studio-ui
Starting Studio UI on port 5000
 * Serving Flask app "studio.apiserver" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: off
2018-12-21 10:56:20 ERROR  flask.app - Exception on /api/get_user_experiments [POST]
Traceback (most recent call last):
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/studio/apiserver.py", line 171, in get_user_experiments
    user, blocking=True)]
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/studio/http_provider.py", line 198, in get_user_experiments
    data=json.dumps({"user": user}))
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/requests/api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/patrick/anaconda3/envs/studio27/lib/python2.7/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
ConnectionError: HTTPSConnectionPool(host='zoo.studio.ml', port=5000): Max retries exceeded with url: /api/get_user_experiments (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7ffb63a87850>: Failed to establish a new connection: [Errno -2] Name or service not known',))

So it cannot connect to the server zoo.studio.ml.

Is your server down? What are the options for emulating this server locally or somewhere else?

I wonder if I did anything wrong with the configuration itself.
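For what it's worth, the `[Errno -2] Name or service not known` at the bottom of the traceback is a DNS resolution failure rather than an HTTP-level error. A minimal check outside of StudioML (the `can_resolve` helper is just for illustration, not part of StudioML):

```python
import socket

def can_resolve(host):
    """Return True if `host` resolves via DNS, False otherwise."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:  # [Errno -2] Name or service not known
        return False

print(can_resolve("zoo.studio.ml"))  # was False when this issue was filed
print(can_resolve("name.invalid"))   # the .invalid TLD never resolves
```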

karlmutch commented 5 years ago

zoo.studio.ml is currently out of service due to Google account transition issues. If you wish to run experiments in the meantime, you can make use of AWS S3, Google Cloud Storage, a locally hosted minio server, or other S3-compatible storage infrastructure.

The same account issue has also broken the name server for the documentation host. Documentation can be accessed from https://studioml.readthedocs.io/en/latest/.

There is a mitigation if you want to run completely locally.

StudioML is designed to capture experiment results in shared infrastructure within a team environment, so it is not explicitly designed to store everything locally. However, it is possible to leverage the S3 support to store your artifacts and results on a local machine by using a server such as minio, https://minio.io/. Minio can run locally on your laptop, and you can then point the database and storage sections of your config file at it.
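For example, the config sections might be pointed at a local minio instance along these lines. This is only a sketch: the key names (in particular `endpoint`) and bucket names are assumptions and may differ between StudioML versions, so check the documentation for your version; credentials are supplied via the usual AWS-style environment variables or settings.

```yaml
# Sketch: point StudioML at a minio server on localhost:9000.
# Key and bucket names are placeholders, not verified against a
# specific StudioML release.
database:
    type: s3
    endpoint: http://127.0.0.1:9000
    bucket: studioml-metadata

storage:
    type: s3
    endpoint: http://127.0.0.1:9000
    bucket: studioml-artifacts
```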

Patechoc commented 5 years ago

Thanks ;) I understand the point of running StudioML on cloud/distributed infrastructure, but being able to run locally is great and, I believe, necessary: for testing, but also whenever you don't have internet access and still want to capture the metadata of your experiments.

Maybe not the right place to ask, but how do you think StudioML might compare to similar (younger) tools like the following ones?

Is there a similar community of developers helping you develop StudioML?

karlmutch commented 5 years ago

Is there a similar community of developers helping you develop StudioML?

Not at this moment; however, things might well change on this front within the next three months.

For our purposes (Sentient and its partners), however, StudioML still remains focused on evolutionary-learning use cases and a mix of commercial deployment and research efforts.

you don't get access to internet and still want to capture the metadata of these experiments

StudioML depends more on distributed components than on the cloud itself. If you wish to run the storage services (the database and storage sections of the config) in a laptop context, one option is to deploy a rabbitMQ and minio server as part of a microk8s cluster. I have been doing this locally on a new branch, https://github.com/karlmutch/studio-go-runner/blob/feature/183_json_metadata/test_k8s_local.yaml. This file will appear in the master branch at some point, but it gives an idea of how Kubernetes deployment models can be used for localized deployments.
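To illustrate the deployment model (this is not the contents of the linked file), a minimal Kubernetes manifest for a local minio instance might look like the following; the names and credentials are placeholders, and the `MINIO_ACCESS_KEY`/`MINIO_SECRET_KEY` environment variables reflect minio releases of this era:

```yaml
# Sketch: a single-replica minio Deployment plus Service, suitable for
# applying to a local microk8s cluster. Credentials are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
      - name: minio
        image: minio/minio
        args: ["server", "/data"]
        env:
        - name: MINIO_ACCESS_KEY
          value: studioml-access
        - name: MINIO_SECRET_KEY
          value: studioml-secret
        ports:
        - containerPort: 9000
---
apiVersion: v1
kind: Service
metadata:
  name: minio
spec:
  selector:
    app: minio
  ports:
  - port: 9000
    targetPort: 9000
```

With such a manifest applied (e.g. `microk8s.kubectl apply -f minio.yaml`), the StudioML config would point at the service's port 9000 endpoint.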

As for the differences between other projects and StudioML that matter to our situation: I can't speak for other projects, but I can point out some of the reasons why we currently use StudioML.

- We extensively use Kubernetes but explicitly/intentionally do not tightly couple our architecture to it.
- We have the freedom to run virtualized Python (virtualenv) and a variety of container runtimes that cloud providers cannot.
- We have both untrusted and trusted workloads and can cut corners.
- We use a broader range of learning strategies, including evolutionary learning, than the metadata models employed by other frameworks allow.
- We have cost constraints that cannot match the cloud privilege others have, so we have to be creative to maintain state-of-the-art results that match or exceed those of large players in the cloud space.

In answering questions related to, for example, running workloads locally using things like microk8s and minio, my approach has been to address functional requirements outside of the code, in the architecture/deployment. The downside is that, for use cases outside the initial scope, more knowledge is needed than for what might otherwise be a single-button-click install and run. The upside is that StudioML can be deployed more broadly without adding code for individual features. We have not altered StudioML for some time; however, for our projects we have been extensively using and abusing it, adding value around it without needing to 'prod the code'.

karlmutch commented 5 years ago

Closing due to inactivity.