Closed dnomadb closed 9 years ago
@dnomadb say you have an iterator over Python dicts like {"n0n3n1s3s2s2s1s3n0n1n2n": {"value": 1.0142107312820954}}
... just do

    for item in items:
        click.echo(json.dumps(item))

click.echo() adds a LF just like Python's print does. Don't echo the whole thing, just echo record-by-record or feature-by-feature.
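For reference, here is a minimal, self-contained sketch of that record-by-record pattern. It uses only the standard library so it runs anywhere: the `echo_records` helper, its `echo` parameter, and the sample records are hypothetical names made up for this illustration, and passing `click.echo` as `echo` should behave the same way, since it also appends a newline by default.

```python
import io
import json


def echo_records(items, echo=print):
    """Write each item as one JSON document per line, as soon as it is seen."""
    for item in items:
        # json.dumps emits no raw newlines by default, so one call is one line.
        echo(json.dumps(item))


# Hypothetical sample records standing in for the real iterator.
records = [{"a": {"value": 1.0}}, {"b": {"value": 2.5}}]

# Capture the output in memory here; in a real script `echo` would be
# click.echo or print, writing straight to stdout.
buffer = io.StringIO()
echo_records(records, echo=lambda line: print(line, file=buffer))
output_lines = buffer.getvalue().splitlines()
```

Each element of `output_lines` is a complete JSON document, so a downstream tool can parse the stream one line at a time.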
Oh my that is so simple. Thanks @sgillies
One more thought @sgillies - let's say I need to do some work on this same set of items, e.g.:

    def doThis(thing):
        return doSomeWork(thing)

    items = list(doThis(item) for item in items)
    for item in items:
        click.echo(json.dumps(item))

Would doing it like this be faster overall? As in, the echo happens as each task completes, rather than per item after all are complete, e.g.:

    for item in items:
        click.echo(json.dumps(doThis(item)))
@dnomadb yes, that's faster. Cut out the middle (list) man!
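To make the difference concrete, here is a small sketch that records the order in which work and output happen when there is no intermediate list. `do_work`, `emit`, and `stream` are hypothetical stand-ins (for doSomeWork, click.echo, and the loop above); the total amount of work is presumably the same either way, but the first line goes out as soon as the first item is done and nothing is held in memory.

```python
import json

events = []  # records the order in which work and output happen


def do_work(thing):
    # Hypothetical stand-in for doSomeWork.
    events.append(("work", thing))
    return {"id": thing, "value": thing * 2}


def emit(line):
    # Stand-in for click.echo, so the ordering can be inspected.
    events.append(("echo", line))


def stream(items):
    # Process and emit one item at a time; no intermediate list is built.
    for item in items:
        emit(json.dumps(do_work(item)))


stream(iter([1, 2]))
```

After this runs, `events` alternates between work and echo entries: item 1 is processed and emitted before item 2 is touched, which is what lets a piped consumer start working right away.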
This is all integrated.
In order to pipe (streaming) to a tool that will asynchronously update a database, we need to print this format to stdout:
That is, line-separated objects. Right now, I:
1. Call json.dumps on each object within a list: https://github.com/mapbox/make-surface/blob/point-sampler/makesurface/scripts/fill_facets.py#L59-L65
2. click.echo this list, joined with a newline char ('\n'.join(theList)): https://github.com/mapbox/make-surface/blob/point-sampler/makesurface/scripts/fill_facets.py#L129
This seems to perform fine alone, but when piped to a streaming db update script, it is very slow.
Bottom line
We need to print this line-delimited set of JSON objects out in a way that is suitable for streaming.
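As a rough illustration of what "suitable for streaming" means on the receiving end, here is a sketch of a consumer that parses one object per line as lines arrive. The `read_records` helper is a hypothetical name, and `io.StringIO` stands in for `sys.stdin`; the actual db update script is not shown in this thread and may work differently.

```python
import io
import json


def read_records(stream):
    """Yield one parsed object per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            # Each line is a complete JSON document, so it can be parsed
            # without waiting for the rest of the input.
            yield json.loads(line)


# io.StringIO stands in for sys.stdin in this sketch.
fake_stdin = io.StringIO('{"a": {"value": 1.0}}\n{"b": {"value": 2.5}}\n')
parsed = list(read_records(fake_stdin))
```

A consumer shaped like this can only begin once it sees a complete line, so a producer that joins everything and echoes once at the end keeps it idle until all the work is finished.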
cc: @ian29 @sgillies @rclark