Closed tucanae47 closed 6 years ago
hey, one general thing I noticed: some doc strings have changed. Can you check your changes and update + add the doc strings accordingly?
Thank you guys for the review, I'm pushing a new commit with the suggested changes. I agree with mostly everything, so I updated the implementation accordingly. Also, I updated the missing docstrings.
Hey, did you install pre-commit
? in transport.py
some checks failed for me.
no i didn't sorry, just installed and fixed accordingly.
This PR looks good but I have one concern though,
self.get_connection()
decides if nodes are available only if there are retires left for that node i.e.
for node in self.retries:
if self.retries[node] > 0:
return True
return False
and the next node is picked if the existing node has been put into delay i.e.
if current_time_ms > picked_time:
node = connections[self.picked]["node"]
return node
else:
self.next_node(connections)
node = connections[self.picked]["node"]
return node
So, for a network of size, say 10 nodes, the max_tries
for each node would be 40
. Since the available nodes are only decided based on their available retries a net total of 400
retries will be attempted for a given failing request before the driver gives up and raises an exception.
Two questions:
/cc @vrde
Travis tests are failing as stated by @ttmc on this comment.,i believe will pass with that PR merged
@tucanae47 I fixed the tests. You can just merge master into your branch and it should work :)
so the last exception raised will have more priority over the timeout, and the goal is to inform (give a hint) the user what is wrong with the last node used
@tucanae47 that's a good point! What do you think about collecting all the errors that occurred? It should be even more helpful.
Also, when the global timeout expires, I would append a custom timeout error to the list instead of mimicking an actual HTTP error cause in this case it's not an HTTP error obviously.
Hey guys, i pushed new changes with last review suggestions. Main change besides review is related with @ldmberman suggestions regarding nodes normalization, so i (propose) implicitly removed headers parameter from Transport class, and let normalization with headers to happen in utils, so we only deal with one kind of format to use internally which is:
[{'endpoint': 'https://test.bigchaindb.com',
'headers': {'app_id': 'your_app_id',
'app_key': 'your_app_key'}},
{'endpoint': 'https://test2.bigchaindb.com',
'headers': {'app_id': 'your_app_id',
'app_key': 'your_app_key',
'extra_header': 'extra value'}}...]
@tucanae47 thank you for the update! The only part I would like to further refine I described here. What are your thoughts on it?
If you change something again, could you merge master into your branch again? We had some issues with codecov and now it is working again :)
Merging #430 into master will decrease coverage by
0.42%
. The diff coverage is98.41%
.
@@ Coverage Diff @@
## master #430 +/- ##
==========================================
- Coverage 100% 99.57% -0.43%
==========================================
Files 5 5
Lines 192 233 +41
==========================================
+ Hits 192 232 +40
- Misses 0 1 +1
@ldmberman please have a look at the new push, basically im collecting a dict of errors raised per node:
exceptions = {}
while time_left.total_seconds() > 0:
try:
connection = self.get_connection(time_left)
response = connection.request(..)
return response.data
except TransportError as err:
end = time()
time_left -= timedelta(seconds=end - start)
self.pool.fail_node()
start = time()
exceptions[err.url] = {"code": err.status_code, "info": err.info}
except BaseException as err:
...
exc_cls = HTTP_EXCEPTIONS.get(504, TransportError)
if len(exceptions):
raise exc_cls(504, "Gateway Timeout", exceptions)
...
as of the actual error code and exception to return i think is good enough to use aws http load balancers 504 timeout error
i think is good enough to use aws http load balancers 504 timeout error
It's appropriate if you implement an HTTP server. Here you might make the user think there is actually an HTTP server somewhere which closed the connection after a timeout.
What do you think about defining a custom exception and raising it there?
class TimeoutException(BigchaindbException):
pass
Hi, just merged master and it seems codecov is failing, not sure how to fix it.
Hey, the codecov can be ignored for now. It is just complaining about one line :)
Description
Added support for BEP/14 to improve Drivers Reliability Added new test set for mutiple nodes
The approach implemented is the roundrobin connection strategy previously discussed on bep14 issue
How to QA
For running tests with docker i used suggested approach on contributing guideline command with an addition to test multiple nodes: