Closed viirya closed 11 years ago
Sorry that it took us so long to get this up. Viirya, could you please take a look at
https://github.com/yahoo/storm-yarn/wiki/Contributing
and send in a signed copy of the CLA. Then we will take a look at your contribution.
Revans2, I will send the CLA later. Thanks.
Over all the changes look good, with just some minor rework. Thanks.
Andy the storm.zip.path points to where in HDFS the zip file is placed. The storm.home.in.zip points to where inside the zip file storm home is. We probably want to document them better at some point but I don't think it is a blocker for pulling this patch in.
Is there any reason that we need configuration control for "where inside the zip file storm home is". If this is caused by createtarball.sh, let's create an issue and have it fixed soon.
Because it is needed to set the storm home in classpath and app master's command, if it is incorrect the system can not work. As I tried to run storm-yarn at first, I used storm zip built from storm sources and found that storm-yarn can not work. Then I noticed that the storm home included in the zip file will contain storm version e.g. storm-0.9.0-wip19. If you build storm from its sources and directly take it to work with storm-yarn, the storm home setting will be wrong.
Of course you can ask users to uncompress their zip files and rename the storm home to 'storm' as you can ask them to put their storm zip files only at /lib/storm/storm.zip. But I think as you already have used storm version to find storm zip file location and provide a configuration for it, why no do the same thing on the storm home in zip file? Then by default it will automatically set up storm home using storm version variable, and users can also set it by configure file.
Andy this is fixing a real issue. I don't want to fix it createtarball.sh because when your patch goes in and storm can run with netty instead of zeromq there will be many people that do not even run createtarball.sh. They will just take the production zip file. I think this is preferable because it gives us flexibility in the future, even if the packaging changes.
I think that we need to enforce the structure of storm.zip. The flexibility of storm.zip internals will only increase confusion. The constraint could be that the given zip file MUST have a top level entry "storm". \ could be empty, or version # etc. With such a constraint, we should be able to find out "storm.home.in.zip" value by java.util.zip.ZipInputStream and ZipEntry classes.
Also, as an separated issue, we should just eliminate createtarball.sh, and give instruction to use standard storm tools.
That is almost the default behavior. The config just provides more flexibility if needed.
In fact as revans2 said that what Andy mentioned is almost the default behavior in previous commits. The difference is if the configuration or not. I remove the configuration and let it to find storm home dir in zip file automatically by using java.util.zip.ZipInputStream and ZipEntry classes as Andy's suggestion.
The config provides more flexibility and maybe increase confusion. That depends on you. For me I think it is no problem to provide a flexible way to change the storm home dir by a config. This pull request just aims to fix the problem caused by the fixed storm home dir ("storm") in the zip file.
Looks nice. +1 for merge.
The new changes look good, but we still need to address createConfigurationFileInFs. We either need to revert it or fix it properly. Because as it is now we may be able to launch multiple workers those workers will never be able to launch anything.
Is the problem caused by only one supervisor is allowed to run on a host and other supervisors launched later on the same host will be useless?
If it is, should we record the hosts we already launched supervisors and skip those hosts when acquire containers?
This is a short term limitation. We will enable multiple supervisors to be launched very soon.
Andy Feng
Sent from my iPhone
On Jul 3, 2013, at 1:58 AM, Liang-Chi Hsieh notifications@github.com wrote:
Because only one supervisor is allowed to run at a host and other supervisors launched later will be useless, should we record the hosts we already launched supervisors and skip those hosts when acquire containers?
— Reply to this email directly or view it on GitHub.
As Andy indicated we are working on a patch for storm to allow the workers, and other daemons, to have ephemeral ports. I should be posting that patch to storm shortly.
So do we still need to address the problem mentioned by revans2? Sounds like when you submit the patch to allow multiple supervisors, it will not be a problem anymore.
We still need to revert the changes in Utils.createConfigurationFileInFs or else none of the supervisors will be able to download jars/etc from nimbus.
The changes in createConfigurationFileInFs are reverted. However, supervisors now face same IOException problem as config file is overwrote.
The changes look good to me. Thanks for your help on this, and sorry it has taken so long to get going.
+1
Two features included in this pull request: