Closed wenhaocs closed 12 months ago
Based on an experiment by @HarrisChu:
This is because the LB has multiple connections to the server side. For every session we create on the client side, we call `use space`. However, if the LB has more connections than there are sessions, some connections may not be associated with a session that called `use graph`. In addition, an HTTP server is usually stateless: on a given request, it may or may not reuse the connection on which `use space` was previously executed.
client -> :8080 -> :9119 and :9449
when
and then it would cause the `Space was not chosen` error.
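The failure mode described above can be reproduced in a toy model. The following is only a minimal sketch, not NebulaGraph code: the `Graphd` and `RoundRobinLB` classes, the round-robin policy, and the space name `demo_space` are assumptions made for illustration (the real LB may pick backends differently). Each backend keeps the chosen space only in its own memory, so a statement routed to the other backend fails.

```python
import itertools


class Graphd:
    """Toy stand-in for one graphd instance; session state is kept locally only."""

    def __init__(self, name):
        self.name = name
        self.space_by_session = {}  # session id -> chosen space (hypothetical layout)

    def execute(self, session_id, stmt):
        if stmt.lower().startswith("use "):
            # Remember the chosen space on THIS instance only.
            self.space_by_session[session_id] = stmt.split(None, 1)[1]
            return "OK"
        if session_id not in self.space_by_session:
            return "SemanticError: Space was not chosen."
        return f"OK ({self.name}, space={self.space_by_session[session_id]})"


class RoundRobinLB:
    """Toy load balancer that alternates between backends (an assumed policy)."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def send(self, session_id, stmt):
        return next(self._cycle).execute(session_id, stmt)


def demo():
    # Backend names mirror the ports mentioned in the thread (:9119 and :9449).
    lb = RoundRobinLB([Graphd("graphd-9119"), Graphd("graphd-9449")])
    return [
        lb.send(1, "USE demo_space"),  # lands on the first backend
        lb.send(1, "SHOW TAGS"),       # lands on the second backend, which never saw USE
        lb.send(1, "SHOW TAGS"),       # back on the first backend
    ]


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In this model only the request that lands on the backend that did not see the `USE` statement fails, which is consistent with the report that a small share of requests hit the error.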
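Confirmed with sc: with Envoy in place, different HTTP streams from the same client may hit different graphd instances. Therefore, whenever a new session is created, it needs to be written to meta immediately. Add a switch to toggle this behavior.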
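It is confirmed that this will be implemented by modifying the core: on every session creation and every `use space`, graphd reports directly to meta. Whether this is controlled by a flag or done only on the sc branch is for the development team to decide. This approach has some drawbacks, for example: 1) graphd reports more information to meta on every session claim; 2) session idle timeout can cause retry problems. As for the sticky-session approach, even if sc's LB supports it, we would need to modify the fbthrift http2 implementation to support cookies. It would also run into problems under HPA, so it cannot fully meet the requirements.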
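The proposed change can be sketched in the same toy style. This is a hedged illustration of the idea, not the actual graphd or meta implementation: the `Meta` and `Graphd` classes and the `report_immediately` switch are hypothetical names, and the sketch ignores any other synchronization the real system performs. With the switch on, `use space` is written to the shared meta store right away, so another graphd can resolve the session's space.

```python
class Meta:
    """Toy stand-in for the meta service: one store shared by all graphd instances."""

    def __init__(self):
        self.space_by_session = {}


class Graphd:
    """Toy graphd that optionally reports the chosen space to meta right away."""

    def __init__(self, name, meta, report_immediately):
        self.name = name
        self.meta = meta
        self.report_immediately = report_immediately  # hypothetical switch
        self.local = {}  # session id -> space, cached on this instance

    def execute(self, session_id, stmt):
        if stmt.lower().startswith("use "):
            space = stmt.split(None, 1)[1]
            self.local[session_id] = space
            if self.report_immediately:
                # Extra write to meta: this is the added cost noted in the thread.
                self.meta.space_by_session[session_id] = space
            return "OK"
        space = self.local.get(session_id)
        if space is None:
            # Unknown locally: fall back to whatever meta knows about the session.
            space = self.meta.space_by_session.get(session_id)
        if space is None:
            return "SemanticError: Space was not chosen."
        self.local[session_id] = space
        return "OK"


def run(report_immediately):
    meta = Meta()
    first = Graphd("graphd-a", meta, report_immediately)
    second = Graphd("graphd-b", meta, report_immediately)
    first.execute(1, "USE demo_space")
    # The follow-up statement is routed to a different instance.
    return second.execute(1, "SHOW TAGS")


if __name__ == "__main__":
    print("switch off:", run(False))
    print("switch on: ", run(True))
```

The sketch also makes the first drawback visible: every `use space` now costs an additional write to meta.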
Describe the bug (required): When using HTTP/2 mode, a small percentage of requests fail with the error "query failed with error code -1009 and error message SemanticError: Space was not chosen."
Your Environments (required): AWS with LB
How To Reproduce(required)
Steps to reproduce the behavior:
Expected behavior
Additional context