mozilla-services / heka

DEPRECATED: Data collection and processing made easy.
http://hekad.readthedocs.org/

Some questions about sandbox #1976

Closed SeeWei1985 closed 7 years ago

SeeWei1985 commented 7 years ago

Hi, I am using lua_bloomfilter (https://github.com/loveghost/lua_bloomfilter) to count unique visitors for a website.

The following code is from my Heka sandbox filter:

    items_uv_table = {}
    items_uv_count_table = {}

    function process_message()
        ....
        if not items_uv_table[tabKey]:contain(user_id) then
            items_uv_table[tabKey]:put(user_id)
            if not items_uv_count_table[tabKey] then
                items_uv_count_table[tabKey] = 1
            else
                items_uv_count_table[tabKey] = items_uv_count_table[tabKey] + 1
            end
        end
        ...
    end

tabKey and user_id are both strings. However, the program hangs at items_uv_table[tabKey]:put(user_id); if I delete that statement, the program runs smoothly. If I instead write items_uv_table[tabKey]:put("ab"), the program also runs smoothly. In other words, the length of the argument passed to items_uv_table[tabKey]:put("xxx") must be less than 3, otherwise the program hangs.

Finally, the same code runs smoothly in a standalone Lua environment, even when the user_id is "f9adc670c09e491b82d177d8da9fd9a4". The code is as follows:

```lua
require 'string'
require 'table'
require 'math'
require 'os'

local BloomFilter = require("bloomfilter")

items_uv_table = {}
items_uv_count_table = {}

-- Count user_id once per tabKey, using one bloom filter per key.
function f(user_id, tabKey)
    if not items_uv_table[tabKey] then
        items_uv_table[tabKey] = BloomFilter:new(2000000)
    end

    if not items_uv_table[tabKey]:contain(user_id) then
        items_uv_table[tabKey]:put(user_id)
        if not items_uv_count_table[tabKey] then
            items_uv_count_table[tabKey] = 1
        else
            items_uv_count_table[tabKey] = items_uv_count_table[tabKey] + 1
        end
    end
end

for i = 1, 4 do
    f("f9adc670c09e491b82d177d8da9fd9a4", "t1")
    f("f9adc670c09e491b82d177d8da9fd9a4", "t1")
    f("f9adc670c09e491b82d477d8da9fd9a4", "t1")
end

print(items_uv_count_table["t1"])
```
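
To help narrow down whether the hang comes from the bloom filter itself or from the surrounding counting logic, the same logic can be run against a plain Lua table used as an exact set. This is only a minimal sketch: `ExactSet` and `count_visitor` are hypothetical names introduced for illustration and are not part of lua_bloomfilter or Heka. The stand-in only mirrors the `contain`/`put` method names used above.

```lua
-- Hypothetical stand-in for the bloom filter: an exact set backed by a Lua table.
-- ExactSet is an illustrative name, not part of lua_bloomfilter.
local ExactSet = {}
ExactSet.__index = ExactSet

function ExactSet:new()
    return setmetatable({ items = {} }, self)
end

function ExactSet:contain(key)
    return self.items[key] == true
end

function ExactSet:put(key)
    self.items[key] = true
end

local items_uv_table = {}
local items_uv_count_table = {}

-- Same counting logic as in the filter, with the set swapped in.
local function count_visitor(user_id, tabKey)
    if not items_uv_table[tabKey] then
        items_uv_table[tabKey] = ExactSet:new()
    end
    if not items_uv_table[tabKey]:contain(user_id) then
        items_uv_table[tabKey]:put(user_id)
        items_uv_count_table[tabKey] = (items_uv_count_table[tabKey] or 0) + 1
    end
end

count_visitor("f9adc670c09e491b82d177d8da9fd9a4", "t1")
count_visitor("f9adc670c09e491b82d177d8da9fd9a4", "t1")
count_visitor("f9adc670c09e491b82d477d8da9fd9a4", "t1")
print(items_uv_count_table["t1"])  -- prints 2
```

If this variant runs inside the sandbox with long user_id strings, the problem is more likely in how the bloom filter's `put` behaves there than in the counting code. An exact set grows with every distinct visitor, so it is meant here only as a debugging aid, not as a replacement for the bloom filter.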