Open jfischer opened 12 years ago
Were you trying examples/file_index_noquery.json?
Unfortunately, I cannot reproduce that on my machine. Maybe we can debug this during our phone call today.
Does ZeroMQ actually queue messages? If so, can we get queue size statistics?
Yes, it does. But unfortunately we cannot access the queue data. Right now, each block maintains a counter that tracks how many requests it has served. This data is sent to the Master, and it can help us decide where to parallelize.
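To make the counter idea concrete, here is a minimal sketch of how per-block counts could be turned into a throughput figure on the Master side. The `Block` and `Master` classes and the report format are hypothetical names for illustration, not the project's actual code.

```python
class Block:
    """Hypothetical block that counts the requests it has served."""

    def __init__(self, name):
        self.name = name
        self.served = 0

    def handle(self, request):
        self.served += 1  # one more request served by this block
        return request

    def report(self):
        # Payload a block might send to the master periodically (assumed format).
        return {"block": self.name, "served": self.served}


class Master:
    """Hypothetical master that remembers the last count seen per block."""

    def __init__(self):
        self.last_seen = {}

    def update(self, report):
        """Record a report; return requests served since the previous report."""
        previous = self.last_seen.get(report["block"], 0)
        self.last_seen[report["block"]] = report["served"]
        return report["served"] - previous
```

Comparing the delta for a producer with the delta for the consumer it feeds gives a rough estimate of backlog growth, even though the ZeroMQ queue itself is not visible.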
The other two design choices could work. But in some cases the producers cannot back off, because they might be getting data from external sources. The solution in this case would be to shard the consumer and add extra consumers when traffic increases.
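A rough sketch of the sharding idea, assuming we know an incoming message rate and a per-consumer capacity. `ConsumerPool`, `rebalance`, and `shard_for` are made-up names, and the sizing rule is only one possible policy.

```python
import zlib


class ConsumerPool:
    """Hypothetical pool of consumer shards that grows with traffic."""

    def __init__(self, per_consumer_capacity, start=1):
        self.capacity = per_consumer_capacity  # messages/sec one consumer can absorb
        self.consumers = start

    def rebalance(self, incoming_rate):
        """Grow the pool until it can absorb incoming_rate; return the pool size."""
        needed = -(-incoming_rate // self.capacity)  # ceiling division
        # This sketch only ever adds consumers; it never shrinks the pool.
        self.consumers = max(self.consumers, needed)
        return self.consumers

    def shard_for(self, key):
        """Map a message key to a consumer index with a stable hash."""
        return zlib.crc32(key.encode("utf-8")) % self.consumers
```

Note that adding a consumer changes which shard a key maps to, so this simple modulo scheme only suits consumers that do not need to see the same key every time.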
I tried running all the blocks of the file_index.json example on my laptop. It drove up CPU and memory usage to the point where the UI froze and I had to hard-reset my laptop. This will also be an issue in distributed scenarios where multiple producers (e.g. crawlers) feed into one consumer. The solution is to add throttling to push/pull connections.
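As an illustration of what throttling buys us, here is a small standard-library sketch in which a bounded buffer stands in for the push/pull connection. It is an analogy, not ZeroMQ code: `run_pipeline` and `max_pending` are hypothetical names, and ZeroMQ's high-water mark socket options play a broadly similar role.

```python
import queue
import threading
import time


def run_pipeline(n_items, max_pending):
    """Push n_items through a bounded buffer.

    Returns (largest backlog observed by the producer, items consumed).
    """
    buf = queue.Queue(maxsize=max_pending)
    consumed = []

    def consumer():
        while True:
            item = buf.get()
            if item is None:  # sentinel: producer is finished
                return
            time.sleep(0.001)  # simulate a consumer slower than the producer
            consumed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()

    peak = 0
    for i in range(n_items):
        buf.put(i)  # blocks while the buffer is full, throttling the producer
        peak = max(peak, buf.qsize())
    buf.put(None)
    worker.join()
    return peak, len(consumed)
```

With an unbounded buffer the producer would race ahead and memory would grow with the backlog; with the bound, the backlog never exceeds `max_pending` and the producer runs at the consumer's pace.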
We need to discuss the design options. Here are a few ideas: