pystorm / streamparse

Run Python in Apache Storm topologies. Pythonic API, CLI tooling, and a topology DSL.
http://streamparse.readthedocs.io/
Apache License 2.0

Using Streamparse in HDP 2.3.2 - Storm version does not get parsed #356

Closed srujun closed 7 years ago

srujun commented 7 years ago

Hi,

I've been trying to set up Streamparse 3.4.0 on HDP 2.3.2 with Python 2.7.8. Running submit fails with an error in the local_storm_version() function in streamparse/util.py. Upon testing, I found that the list returned by the re.findall call for the version pattern is empty:

pattern = r'^Storm ([0-9.]+)'
return parse_version(re.findall(pattern, res.stdout, flags=re.MULTILINE)[0])

The string returned by the "storm version" command on HDP 2.3.2 is:

Running: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91.x86_64/bin/java -client ..(omitted text).. supervisor/conf backtype.storm.utils.VersionInfo
Storm 0.10.0.2.3.2.0-2950
URL git@github.com:hortonworks/storm.git -r f417df29609d4477b485b4cb01c6e496f0715bab
Branch (HEAD detached at f417df2)
Compiled by jenkins on 2015-09-30T20:17Z
From source with checksum 4a2753c9efa4bb8463ea768bf1561412

The important part is the "Storm 0.10.0.2.3.2.0-2950" line. The funny thing is that, even though it appears on its own line above, there is no newline character between the huge "Running: ..." line before it and the version itself, so the ^ anchor in the pattern never matches, even with re.MULTILINE. On top of that, HDP Storm uses hyphens in its version numbers, which [0-9.] does not cover. Together, these two issues keep the Python code from parsing the version number correctly. The fix I came up with is:

pattern = r'Storm ([0-9.-]+)'
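
To make the difference concrete, here is a minimal sketch that runs both patterns against a shortened stand-in for the output above. The sample string is an assumption for illustration: its first line is abbreviated, and the version is glued directly to it with no newline, as described.

```python
import re

# Shortened stand-in for the `storm version` output on HDP 2.3.2 (assumed
# for illustration; the real first line is much longer). Note that there is
# no newline between the "Running: ..." text and "Storm".
sample = (
    "Running: java -client backtype.storm.utils.VersionInfo"
    "Storm 0.10.0.2.3.2.0-2950\n"
    "URL git@github.com:hortonworks/storm.git -r f417df2\n"
)

original = r'^Storm ([0-9.]+)'   # anchored, digits and dots only
relaxed = r'Storm ([0-9.-]+)'    # unanchored, also accepts hyphens

# "Storm" never starts a line here, so the anchored pattern finds nothing.
original_matches = re.findall(original, sample, flags=re.MULTILINE)
# The relaxed pattern captures the full hyphenated version string.
relaxed_matches = re.findall(relaxed, sample, flags=re.MULTILINE)

print(original_matches)  # []
print(relaxed_matches)   # ['0.10.0.2.3.2.0-2950']
```

One trade-off of dropping the ^ anchor is that the pattern can now match "Storm " followed by digits anywhere in the output, not only at the start of a line, which may or may not matter on other deployments.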

But I have not tested this against other Storm deployments. I need to automate the setup of Streamparse for my application, so I would like this fix to be in the Streamparse sources themselves. Is it worth submitting a pull request, or is there another way I can work around this?

Thanks!