Currently, when launching a supervisor, the AM overwrites the original storm.yaml file so that nimbus.host can be written into it. This lets the supervisors find nimbus and download jars and config files from it. We need to stop overwriting this file, because doing so causes the distributed cache to think the file has changed, and the supervisor then fails to launch. Instead, we should write the file out once for the supervisors, after nimbus comes up, and not touch it again after that.
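A minimal sketch of the "write it once, then never touch it" behavior, assuming the file lives on a filesystem reachable through java.nio. The class name `WriteOnceConfig` and the method `writeOnce` are hypothetical and are not part of storm-yarn; the real AM code may need to do the equivalent against HDFS instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper (not part of storm-yarn): writes a config file at most once. */
final class WriteOnceConfig {
    private WriteOnceConfig() {}

    /**
     * Writes contents to target only if target does not exist yet.
     * Returns true if the file was created by this call, false if it was
     * already present, in which case it is left completely untouched.
     */
    static boolean writeOnce(Path target, String contents) throws IOException {
        try {
            // CREATE_NEW makes the write fail if the file already exists,
            // so an existing file is never truncated or rewritten.
            Files.write(target,
                        contents.getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE_NEW,
                        StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            // Already written on an earlier supervisor launch: do nothing.
            return false;
        }
    }
}
```

With something like this, the AM could call `writeOnce` each time it launches a supervisor: the first call after nimbus comes up creates the supervisor copy of storm.yaml with nimbus.host set, and later calls leave the file as it is, so it should no longer look changed to the distributed cache.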
Ideally, in the long term, the nodes should discover nimbus through ZooKeeper, so that the supervisors continue to function properly if there is a failover to a new nimbus server. We may also want the supervisors to download the files from HDFS instead of from nimbus, so that they do not need the address of nimbus at all. Both of these are longer-term changes, though, and require changes in Storm itself.