pystorm / streamparse

Run Python in Apache Storm topologies. Pythonic API, CLI tooling, and a topology DSL.
http://streamparse.readthedocs.io/
Apache License 2.0

Create a Python3 virtualenv on Storm #349

Closed tanaysoni closed 7 years ago

tanaysoni commented 7 years ago

I'm using Python 3 to run sparse submit on a topology, with Fabric3 in the topology requirements file, but a Python 2 virtualenv still gets created.

I'm submitting the topology locally, if that matters.

dan-blanchard commented 7 years ago

streamparse uses the virtualenv that's in the PATH of the user you're SSHing into the servers as. If you would like to use Python 3, install Python 3 on the servers and put it at the front of that user's PATH.
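
To see what that PATH lookup resolves to, a small check like the one below can be run on a worker as the SSH user. This is only a sketch: `resolve_on_path` is a hypothetical helper, not part of streamparse, and the list of executable names is an assumption.

```python
import shutil


def resolve_on_path(names=("virtualenv", "python", "python3")):
    """Return the first match on PATH for each name, or None if it is absent.

    Hypothetical helper for illustration; streamparse does not ship this.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Run this as the same user that streamparse uses for SSH.
    for name, path in resolve_on_path().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If `virtualenv` resolves to a Python 2 installation here, that is probably the interpreter the remote virtualenv will be built with.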

mizilmc commented 6 years ago

I have the same problem. I am using an Ubuntu worker node. On Ubuntu, both the python2 and python3 executables are installed in /usr/bin. How do I put the python3 folder at the beginning of my PATH?

dan-blanchard commented 6 years ago

There is actually an easy way to do this as of the most recent release. You can specify virtualenv flags (including -p to set the Python path) in your config.json as explained here.
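
As a rough illustration of how a flags string such as "-p /usr/local/bin/python3" could end up on a virtualenv command line, here is a minimal sketch. `build_virtualenv_command` is a hypothetical helper and the way streamparse actually invokes virtualenv may differ; only the `-p` flag itself is a real virtualenv option.

```python
import shlex


def build_virtualenv_command(env_path, virtualenv_flags=""):
    """Build an argv list for creating a virtualenv at env_path.

    Hypothetical helper for illustration only; this is not streamparse code.
    shlex.split tolerates leading or trailing whitespace in the flags string.
    """
    return ["virtualenv", *shlex.split(virtualenv_flags), env_path]


if __name__ == "__main__":
    cmd = build_virtualenv_command(
        "/data/virtualenvs/demo",  # assumed path, for illustration
        "-p /usr/local/bin/python3",
    )
    print(" ".join(cmd))
```

Passing the interpreter explicitly with -p sidesteps the PATH ordering question, which helps when python2 and python3 live in the same directory.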

bilibvivo commented 6 years ago

@dan-blanchard I used "virtualenv_flags": " -p /usr/local/bin/python3", but it does not work.

dan-blanchard commented 6 years ago

@dume2007 What version of Streamparse? What version of Storm? Where are you specifying it in config.json? At Parse.ly we're using that on a per-env basis and that works for us.

Here is an anonymized version of our working config.json:

{
    "serializer": "json",
    "topology_specs": "topologies/",
    "virtualenv_specs": "virtualenvs/",
    "envs": {
        "storm2": {
            "user": "fake",
            "nimbus": "fake.example.com:6627",
            "log": {
                "level": "info"
            },
            "virtualenv_root": "/data/virtualenvs",
            "ui.port": 8081,
            "options": {
                "supervisor.worker.timeout.secs": 600,
                "topology.message.timeout.secs" : 60,
                "topology.max.spout.pending" : 500,
                "virtualenv_flags": "-p /usr/local/lib/python3.6.0/bin/python3"
            }
        },
        "storm-beta": {
            "user": "fake",
            "nimbus": "fake-beta.example.com:6627",
            "log": {
                "level": "info"
            },
            "virtualenv_root": "/data/virtualenvs",
            "ui.port": 8081,
            "options": {
                "supervisor.worker.timeout.secs": 600,
                "topology.message.timeout.secs" : 60,
                "topology.max.spout.pending" : 500,
                "num.stat.buckets": 1,
                "virtualenv_flags": "-p /usr/local/lib/python3.6.0/bin/python3"
            }
        },
        "docker": {
            "user": "fake",
            "nimbus": "local.example.com",
            "workers": [
                "local.example.com"
            ],
            "log": {
                "level": "info"
            },
            "virtualenv_root": "/data/virtualenvs"
        }
    }
}

bilibvivo commented 6 years ago

@dan-blanchard Thanks, I found the problem. I was missing the options key.

I think this section of the documentation should mention the options key; otherwise it is misleading. http://streamparse.readthedocs.io/en/latest/quickstart.html#disabling-configuring-virtualenv-creation

dnk8n commented 6 years ago

This part is confusing and the documentation could use some clarification, mainly because it is counter-intuitive that the "virtualenv_root" key is not under the "options" key while "virtualenv_flags" has to be.

If "virtualenv_flags" (or any other key) is found in the wrong location in config.json, at least a warning should be shown.

I am happy to do the work to rectify that if you agree with me.
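
A minimal sketch of the kind of check proposed above, assuming (as reported in this thread) that "virtualenv_flags" only takes effect under an env's "options" key. The names `OPTIONS_ONLY_KEYS`, `find_misplaced_keys`, and `warn_on_misplaced_keys` are hypothetical and not part of streamparse.

```python
import json
import warnings

# Assumption from this thread: these keys are only honored under "options".
OPTIONS_ONLY_KEYS = {"virtualenv_flags"}


def find_misplaced_keys(config):
    """Return sorted (env_name, key) pairs where an options-only key
    sits directly under an env instead of under its "options" dict."""
    misplaced = []
    for env_name, env in config.get("envs", {}).items():
        for key in OPTIONS_ONLY_KEYS & set(env):
            misplaced.append((env_name, key))
    return sorted(misplaced)


def warn_on_misplaced_keys(config):
    """Emit one warning per misplaced key found in the config."""
    for env_name, key in find_misplaced_keys(config):
        warnings.warn(
            f'"{key}" in env "{env_name}" is outside "options" and may be ignored'
        )


if __name__ == "__main__":
    # Illustrative config with the flag in the wrong place.
    sample = json.loads(
        '{"envs": {"prod": {"user": "fake",'
        ' "virtualenv_flags": "-p /usr/local/bin/python3"}}}'
    )
    warn_on_misplaced_keys(sample)
```

The check is limited to keys listed in `OPTIONS_ONLY_KEYS`, so extending it to other settings would mean confirming where each one is actually read.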