Open jolasman opened 5 years ago
The error happens because the createFPtree method returns None for headerTable:
    for k in list(headerTable.keys()):
        if headerTable[k] < minSup:
            del headerTable[k]  # remove items that do not meet the minimum support
    freqItemSet = set(headerTable.keys())  # frequent items that meet the minimum support
    if len(freqItemSet) == 0:
        return None, None
Every headerTable[k] entry was deleted, so headerTable ended up being returned as None. The author set n = 20000 in the demo, which may be too large for your dataset. I decreased n to make the demo work on my dataset.
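To illustrate the point, here is a minimal sketch of how a threshold that is too high empties the table. The function name `filter_header_table` and the sample counts are made up for illustration; this is not the repository's code, only the same filtering idea in isolation.

```python
def filter_header_table(header_table, min_sup):
    """Keep items whose count is at least min_sup.

    Returns None when nothing survives, mirroring the
    `return None, None` branch quoted above.
    """
    kept = {k: v for k, v in header_table.items() if v >= min_sup}
    return kept if kept else None


# Hypothetical item counts, far smaller than the demo's threshold.
counts = {"a": 3, "b": 5, "c": 1}

low = filter_header_table(counts, 2)       # "a" and "b" survive
high = filter_header_table(counts, 20000)  # nothing survives

# Callers should check for None before using the table.
if high is None:
    message = "no frequent items; try a lower threshold"
else:
    message = "ok"
```

With a threshold of 2 the table keeps two items, while 20000 removes everything, so any later code that indexes into the result fails unless it checks for None first.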
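I would like the author to explain how, with a support count of 100000, this ran in 13 seconds on a Mac (as your Chinese blog post says). After I converted your code to Python 3.7, it still took more than ten minutes on an 8th-generation i7 with 16 GB of RAM.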
Hi! :) I changed some of the code to use it with Python 3, but I have some issues. I cannot find a library with the FP-growth algorithm that works. I tried the pyspark one and the FP-growth one. With the pyspark one, I end up with Spark connection errors after some runs. It was working in the beginning, but then it blew up. The second one cannot handle my dataset due to memory problems.
Btw, after I fixed some dict problems with iteritems() and has_key(), the nimeFPtree function gives me an error that I do not understand:
Any thoughts?
Thanks in advance
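
On the iteritems() and has_key() changes mentioned above, a short sketch of the usual Python 3 equivalents may help when checking the port. The dictionary contents and the threshold of 4 are invented for illustration; only the dict idioms themselves are the point.

```python
# Hypothetical header table: item -> support count.
header_table = {"a": 3, "b": 5}

# Python 2: for item, count in header_table.iteritems():
# Python 3: items() returns a view that can be iterated directly.
pairs = sorted(header_table.items())

# Python 2: header_table.has_key("a")
# Python 3: use the `in` operator.
has_a = "a" in header_table

# In Python 3, keys() is a live view, so deleting during iteration
# raises RuntimeError. Iterate over a snapshot (list) of the keys instead.
for key in list(header_table.keys()):
    if header_table[key] < 4:
        del header_table[key]
```

The last loop is the same pattern as the list(headerTable.keys()) line quoted earlier in this thread: the list() call is what makes deletion inside the loop safe.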