Closed oneway3124 closed 3 years ago
I have converted three datasets using the dataset_converters Python scripts. The datasets I preprocessed are redd.hdf5, ukdale, and ampds. However, dataport and some other datasets cannot be preprocessed easily. I would greatly appreciate any comments, help, or suggestions about the datasets.
A snapshot of the successful training and testing on REDD is shown in the figure.
@oneway3124 Can you share the preprocessed REDD dataset with me? I am working on a research project and noticed that some papers use a REDD dataset that was preprocessed in some way. I tried to reproduce their results with the REDD dataset in NILMTK format, but I always fail.
Also, there is a big gap in the mains data of house 1, while data for the appliances exist in that same period. Similar gaps can be found in the mains power of the other houses.
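For anyone who wants to check for such gaps themselves, here is a minimal sketch of the idea, using only the Python standard library and synthetic timestamps rather than real REDD data. The helper names `find_gaps` and `samples_inside` and the 30-second threshold are assumptions made for illustration; they are not part of NILMTK.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap):
    """Return (start, end) pairs where consecutive samples are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


def samples_inside(timestamps, gap):
    """Count samples that fall strictly inside a (start, end) gap."""
    start, end = gap
    return sum(1 for t in timestamps if start < t < end)


# Synthetic data (not taken from any dataset): a mains channel sampled every
# second with a 10-minute hole, and an appliance channel sampled every
# 3 seconds with no hole.
t0 = datetime(2020, 1, 1, 12, 0, 0)
mains = [t0 + timedelta(seconds=s) for s in range(0, 60)]
mains += [t0 + timedelta(seconds=s) for s in range(660, 720)]
appliance = [t0 + timedelta(seconds=s) for s in range(0, 720, 3)]

# Hypothetical threshold: anything longer than 30 s without a sample is a gap.
gaps = find_gaps(mains, max_gap=timedelta(seconds=30))
for gap in gaps:
    count = samples_inside(appliance, gap)
    print(gap[0], "->", gap[1], "| appliance samples inside gap:", count)
```

A non-zero count for a mains gap means the appliance channel kept recording while the mains channel did not, which is the situation described above. With real data the timestamps would come from the converted file instead of the synthetic lists.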
If you want the preprocessed REDD dataset, I can send it to you via email or pan.baidu.com.
Thanks a lot! Best Regards, Wireless Sensor Networks, School of Computer Science, Sichuan University.
Wang Wei(王伟), |mobile: +86-159-0810-6107 | email: wang.david.wei@stu.scu.edu.cn, 190025935@qq.com
I'll try to look into this, but unfortunately we don't own most of the datasets, so we can't just rehost them unless they have a clear license (lots of them don't). I opened https://github.com/nilmtk/nilmtk/issues/909 to track and discuss this in 2021.
But the dataport...
See https://github.com/nilmtk/nilmtk/issues/873 (and https://github.com/Pecan-Street/DataPort-Examples/issues/1 -- no replies so far) -- we probably won't support Dataport in the future, at least not fully. Besides a small selection of CSV files that is freely available under some terms, it's a commercial service that changes without notice. It would be fairer to ask them to support NILMTK instead. NILMTK is maintained only by volunteers.
It's not hard to find discussions on how copyright issues are a real threat nowadays. I recommend that anyone read up on that before hosting someone else's files.
As a final note:
Otherwise, there are still few researchers who can use nilmtk and nilmtk-contrib for research. And the phenomenon will continue to exist: "few publications presenting algorithmic contributions within the field went on to contribute implementations back to the toolkit".
I disagree that's the main reason. Most people who can't get a single dataset to work won't be able to contribute significant features unless they're on the topic for the long term. The issue is more a lack of a culture of collaboration, which is severely lacking in fields like this. Note that NILMTK has had, and still has, collaboration over the years, but that comes from a tiny set of individuals compared to the whole set of users. But I digress...
Hi Sir,
Could the datasets that have been converted to HDF5 files with the dataset_converters also be supplied to the research community?
Otherwise, there are still few researchers who can use nilmtk and nilmtk-contrib for research. And the phenomenon will continue to exist: "few publications presenting algorithmic contributions within the field went on to contribute implementations back to the toolkit".