huawei-noah / streamDM

Stream Data Mining Library for Spark Streaming
http://streamdm.noahlab.com.hk/
Apache License 2.0

Record format with Kafka #92

Closed: zhangjinrui2718 closed this issue 6 years ago

zhangjinrui2718 commented 6 years ago

Infrastructure details
Java version: 1.8
Scala version: 2.11
Spark version: 2.2
OS version: Windows 7 + CentOS 6.7
Cluster mode

I am using Kafka with StreamDM. Suppose there is a training data set in a format (e.g. CSV) like the following:

1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065
1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050
1,13.16,2.36,2.67,18.6,101,2.8,3.24,.3,2.81,5.68,1.03,3.17,1185

The first element is the label, and the remaining elements can be assembled into the feature vector.

So, what format should these records have if I use a producer to send them to a topic?

E.g., suppose I have a sample with label = '1' and feature = '14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065'.

How should I construct a ProducerRecord from the label and the features?

hmgomes commented 6 years ago

Dear @zhangjinrui2718, I cannot test it right now as I am traveling abroad, but I can take a look at it next week.

Dear @JianfengQian, perhaps you are available to assist @zhangjinrui2718 with his question? I think you have more practice with the KafkaReader than I do.

Best Regards, Heitor

zhangjinrui2718 commented 6 years ago

Thanks for your response.

In the online learning field, there are not too many open-source software projects. I think StreamDM is an excellent project, and the need for online learning will keep growing.

Will your laboratory update StreamDM at some point?

hmgomes commented 6 years ago

Dear @zhangjinrui2718,

Yes, we are currently working on updating and extending StreamDM. Expect some updates in the upcoming months :)

Best Regards, Heitor

hmgomes commented 6 years ago

Dear @zhangjinrui2718,

I am sorry for taking so long to provide you with an answer. I was going to update the KafkaReader class, but due to higher-priority aspects of the project I ended up not getting to work on that yet. So far, the best I can do is direct you to the current KafkaReader version; specifically, you will need to set up your instance so that it complies with the code in getExamples(). Please let me know if that works for you.
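As a rough illustration of the producer side, here is a minimal sketch in plain Java that turns one CSV training line into a single string record value. It assumes the reader takes only the record value and expects the label first, then whitespace, then the comma-separated features (a "dense" text layout). That layout is an assumption, not something confirmed in this thread, so please check it against getExamples() in your StreamDM version. The class name KafkaRecordFormat, its helper methods, and the topic name "streamdm-train" are hypothetical.

```java
// Sketch only: the "<label> <f1,f2,...>" layout is an ASSUMPTION about what
// KafkaReader.getExamples() parses; verify it against your StreamDM version.
class KafkaRecordFormat {

    /** Splits a CSV line into {label, features}; the label is the first field. */
    static String[] splitLabelAndFeatures(String csvLine) {
        String[] parts = csvLine.trim().split(",", 2);
        if (parts.length < 2 || parts[0].trim().isEmpty() || parts[1].trim().isEmpty()) {
            throw new IllegalArgumentException("Expected 'label,f1,f2,...' but got: " + csvLine);
        }
        return new String[] { parts[0].trim(), parts[1].trim() };
    }

    /** Builds the assumed record value: label, one space, comma-separated features. */
    static String toRecordValue(String csvLine) {
        String[] labelAndFeatures = splitLabelAndFeatures(csvLine);
        return labelAndFeatures[0] + " " + labelAndFeatures[1];
    }

    public static void main(String[] args) {
        String line = "1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065";
        String value = toRecordValue(line);
        System.out.println(value);

        // With kafka-clients on the classpath and a configured
        // KafkaProducer<String, String> named producer, the value could be sent as:
        //   producer.send(new ProducerRecord<String, String>("streamdm-train", value));
        // "streamdm-train" is a hypothetical topic name.
    }
}
```

The label and features travel together in the record value rather than in the key, on the assumption that the reader ignores the key. If getExamples() in your version parses something different, only toRecordValue needs to change.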

Best Regards, Heitor