Closed irvinlim closed 6 years ago
I added some logging in process.py after line 300:
    while True:
        logger.info('process=%s,conn=%s,master=%s', self, conn, self._master)
        if not conn and self._master:
            conn = Connection(self._master, self)
            logger.info('process=%s, conn=%s', self, conn)
Notice how the self._master variable is always None for MesosOperatorMasterDriver:
framework_1 | 2018-07-27 18:04:40.172|INFO|process=<pymesos.scheduler.MesosSchedulerDriver object at 0x7fbf17bea450>,conn=None,master=None
framework_1 | 2018-07-27 18:04:40.177|INFO|process=<pymesos.operator_v1.MesosOperatorMasterDriver object at 0x7fbf17bea650>,conn=None,master=None
framework_1 | 2018-07-27 18:04:40.183|INFO|process=<pymesos.scheduler.MesosSchedulerDriver object at 0x7fbf17bea450>,conn=None,master=mesos-master:5050
framework_1 | 2018-07-27 18:04:40.187|INFO|process=<pymesos.scheduler.MesosSchedulerDriver object at 0x7fbf17bea450>, conn=<pymesos.process.Connection object at 0x7fbf17beaf10>
mesos-master_1 | I0727 10:04:40.189290 15 http.cpp:1185] HTTP POST for /master/api/v1/scheduler from 172.26.0.5:49914
mesos-master_1 | I0727 10:04:40.189538 15 master.cpp:2610] Received subscription request for HTTP framework 'scheduler'
mesos-master_1 | I0727 10:04:40.189677 15 master.cpp:2745] Subscribing framework 'scheduler' with checkpointing disabled and capabilities [ TASK_KILLING_STATE, GPU_RESOURCES ]
mesos-master_1 | I0727 10:04:40.189718 15 master.cpp:7195] Updating framework 0bc93f09-207f-4528-9b20-d2f0527286c7-0000 (scheduler) with roles { } suppressed
framework_1 | 2018-07-27 18:04:40.188|INFO|process=<pymesos.scheduler.MesosSchedulerDriver object at 0x7fbf17bea450>,conn=<pymesos.process.Connection object at 0x7fbf17beaf10>,master=mesos-master:5050
framework_1 | 2018-07-27 18:04:40.190|INFO|process=<pymesos.scheduler.MesosSchedulerDriver object at 0x7fbf17bea450>,conn=<pymesos.process.Connection object at 0x7fbf17beaf10>,master=mesos-master:5050
framework_1 | 2018-07-27 18:04:40.195|INFO|scheduler_re_registered|master={'hostname': 'mesos-master', 'port': 5050, 'version': u'1.5.0'}
framework_1 | 2018-07-27 18:04:40.195|INFO|process=<pymesos.scheduler.MesosSchedulerDriver object at 0x7fbf17bea450>,conn=<pymesos.process.Connection object at 0x7fbf17beaf10>,master=mesos-master:5050
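The gate in that loop can be illustrated in isolation. The following is a minimal sketch, not PyMesos code: `FakeDriver`, `change_master`, `run_loop`, and the string standing in for a connection are all hypothetical, and only the `if not conn and self._master` condition is taken from the snippet above. It shows that a driver whose `_master` is never set never passes the condition, so no connection is opened and no SUBSCRIBE can be sent.

```python
class FakeDriver:
    """Hypothetical stand-in for a driver; not the PyMesos class."""

    def __init__(self, name):
        self.name = name
        self._master = None  # stays None until a master address is supplied

    def change_master(self, master):
        self._master = master


def run_loop(driver, master_updates, iterations=4):
    """Mimic the connect gate: open a connection only once _master is set.

    master_updates maps an iteration index to the master address that
    becomes known at that iteration (an assumption made for this sketch).
    Returns the iteration at which a connection was opened, or None.
    """
    conn = None
    connected_at = None
    for i in range(iterations):
        if i in master_updates:
            driver.change_master(master_updates[i])
        if not conn and driver._master:
            conn = 'connection to %s' % driver._master  # placeholder object
            connected_at = i
    return connected_at


scheduler = FakeDriver('scheduler')
operator = FakeDriver('operator')

# The scheduler learns its master on the second iteration; the operator never does.
scheduler_result = run_loop(scheduler, {1: 'mesos-master:5050'})
operator_result = run_loop(operator, {})

print(scheduler_result, operator_result)
```

This matches the pattern in the log: the scheduler driver connects as soon as master=mesos-master:5050 appears, while the operator driver is only ever logged with master=None.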
I tried to reproduce this problem with operator_v1.py from the examples. I noticed that _send fails only when trying to connect to a Mesos slave directly. Is this what you mean?
See https://github.com/irvinlim/pymesos-0.3.4-bugrepro for what I mean. The operator is not able to subscribe to any events when using version 0.3.4.
Using the example code, I connected to the master directly with python operator_v1.py mesos-master:5050.
Thanks very much for your bug report repo.
Thank you for the fix!
Can I get you to publish a new version of the package when you can? Thank you!
@ariesdevil Thanks a lot!
Using the latest version of PyMesos 0.3.4, I cannot receive any events using the operator API. It looks like the SUBSCRIBE request is not sent, from my initial investigation.
On PyMesos 0.3.3, after calling driver.start(), I get a log message similar to the following:
However, on 0.3.4, no such log is present, and any task updates or agent updates will not notify the operator.
I believe this can be reproduced with the basic examples provided. Falling back to PyMesos 0.3.3 fixes this issue, which suggests that #98 introduced a regression.
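For anyone who wants to check the master side independently of PyMesos, here is a rough sketch of the SUBSCRIBE call that the v1 operator API expects: a POST to /api/v1 on the master with a JSON body of type SUBSCRIBE. The helper name `build_subscribe_request` is made up for this example, and the master address is the one used in this thread. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_subscribe_request(master):
    """Build (but do not send) a SUBSCRIBE call for the v1 operator API.

    master is a 'host:port' string; the helper itself is hypothetical.
    """
    url = 'http://%s/api/v1' % master
    body = json.dumps({'type': 'SUBSCRIBE'}).encode('utf-8')
    headers = {
        'Content-Type': 'application/json',
        'Accept': 'application/json',
    }
    return urllib.request.Request(url, data=body, headers=headers, method='POST')


req = build_subscribe_request('mesos-master:5050')
print(req.get_method(), req.full_url, req.data)
```

Passing `req` to `urllib.request.urlopen` against a live master should open a long-lived streaming response (RecordIO-framed events). If events arrive that way but not through the driver, that would point at the client side rather than the master.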