Closed kxepal closed 10 years ago
Rationale.
How db.changes
method is supposed to be used:
params = {
'params': {
'feed': 'continuous',
'heartbeat': 30000,
'since': 0,
'timeout': 10000
}
}
changes = db.changes(**params)
assert changes.status_code == 200
for line in changes.iter_lines():
if not line: # skip heartbeats
continue
line = line.decode('utf-8')
print(line)
That's cool, but these are a lot of useless code. Yes, changes feed has line-by-line output and I wanted to iterate over all these lines. Yes, changes feed carries JSON data and I need to decode it to Python objects to work with. Yes, I'd like to skip heartbeats since they aren't interesting to me, but are need to keep connection alive. + decoding bits, since for Python 3.x response contains binary string, while stdlib json decodes only unicode strings.
This PR makes code above looks more simple, intuitive and Pythonic:
params = {
'params': {
'feed': 'continuous',
'heartbeat': 30000,
'since': 0,
'timeout': 10000
}
}
for obj in db.changes(**params):
process(obj) # do work with decoded JSON
Also, due to default chunk size 512 bytes current implementation of naive processing changes feed doesn't emits all the events since there couldn't be \n
character to split the feed. PR fixes that issue too, so you don't have to worry about unprocessed events.
Looks good, my main comment is that there should at least be the option to pass the heartbeat on instead of continuing internally and effectively blocking. Some user code may use these heartbeats to check variables, cancel the feed, etc.
there should at least be the option to pass the heartbeat on instead of continuing internally and effectively blocking
I see the case when proxying heartbeats could make a sense. I this case which value should represent Heartbeat object? I feel that None
is quite good candidate for.
You mean that obj
in the iteration would be None
when it is a heartbeat? Yeah, I think that's good and is consistent with the idea that no JSON object has been returned in the heartbeat.
This behavior is then like a blocking socket with a timeout, where the user needs to check if data was really returned.
@mgmarino done!
@garbados I removed TODO notice. Actually, I cannot reuse Index.iter code since it skips empty lines (which I have to use as heartbeat detector) and also I have to set custom chunk_size on iter_lines call for continuous changes feed.
Sounds good. Merging...
This change allows to process changes feed events without any additional magic work as iter_lines, parsing json, handling heartbeats - all these routines are wrapped inside changes() method.
Also, for continuous changes feed it doesn't holds anymore the latest change events in memory till request timeout or new change events.