merbanan / rtl_433

Program to decode radio transmissions from devices on the ISM bands (and other frequencies)
GNU General Public License v2.0

How to handle Oregon Scientific with the new output system? #212

Closed: enavarro222 closed this issue 8 years ago

enavarro222 commented 8 years ago

I wanted to migrate the Oregon Scientific decoder to the new output format proposed by @eras.

However, it's not so trivial, since the same "device" (r_device oregon_scientific) is used for different kinds of sensors, each producing different data.

I list the following sensors (with the data currently associated with each):

* THGR122N THGR968
 - channel
 - rc (rolling code)
 - temperature_C
 - humidity

* BHTR968 (Indoor)
 - temperature_C
 - humidity
 - presure
 - confort
 - forecast

* RGR968
 - rain_rate
 - total_rain

* THR228N
 - channel
 - temp_c

* THN132N
 - channel
 - rc (rolling code)
 - battery
 - temp_c

* RTGN318
 - channel
 - rc (rolling code)
 - battery
 - temp_c
 - humidity

* THGR810
 - channel
 - temp_c
 - humidity

* UVN800
 - channel
 - uvidx

* WGR800
 - wind speed
 - wind direction

* Owl CM160
 - rawAmp

* Owl CM180
 - id
 - power
 - total energy

(I'm quite surprised that some have neither a channel nor a rolling code...) Anyway, should we split this into 11 device declarations?

The alternative is to declare that oregon_scientific produces the union of these 11 sensors' data... and then, for each sensor, fill only some of the fields. It may be quite ugly, especially in CSV output.

What do you think about that? Other ideas?
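
For illustration, a per-sensor declaration could look roughly like this (sketch only: the column list would go into the .fields member of the new output system, other members are elided, and the identifiers are made up):

/* Sketch: one column list and one r_device per sensor model, so each
 * declaration carries exactly the fields that sensor emits. Identifiers
 * are illustrative, not taken from the actual tree. */
static char *thn132n_output_fields[] = {
    "time",
    "model",
    "rc",
    "channel",
    "battery",
    "temperature_C",
    NULL,
};

r_device oregon_scientific_thn132n = {
    .name   = "Oregon Scientific THN132N",
    /* modulation and timing parameters omitted */
    .fields = thn132n_output_fields,
};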

merbanan commented 8 years ago

With the exception of the CSV data, what is the problem with creating 11 different objects?

enavarro222 commented 8 years ago

Do you mean keeping one "device" for all Oregon Scientific sensors? Indeed, the problem is the CSV output.

I can test it this way and see if other issues appear...

enavarro222 commented 8 years ago

OK, I did it like that for two sensors (the ones I have), see https://github.com/enavarro222/rtl_433

It seems to work well, except (as expected) for CSV output, which is populated with many empty, useless columns:

$ ./src/rtl_433 -F csv -R 12 -r ../rtl_433_tests/tests/oregon_scientific/05/THN132N_180_ch3.data 2>/dev/null
time,model,rc,channel,battery,temperature_C,humidity,presure,confort,forecast,rain_rate,total_rain,uvidx,wind_speed,wind_direction,rawAmp,power,total_power
2015-11-28 23:47:24,THN132N,231,3,OK,18.000000,,,,,,,,,,,,
2015-11-28 23:47:24,THN132N,231,3,OK,18.000000,,,,,,,,,,,,

$ ./src/rtl_433 -F csv -R 12 -r ../rtl_433_tests/tests/oregon_scientific/05/THGR122N_188_54_ch1.data 2>/dev/null
time,model,rc,channel,battery,temperature_C,humidity,presure,confort,forecast,rain_rate,total_rain,uvidx,wind_speed,wind_direction,rawAmp,power,total_power
2015-11-28 23:47:58,THGR122N,248,1,,18.799999,54,,,,,,,,,,,
2015-11-28 23:47:58,THGR122N,248,1,,18.799999,54,,,,,,,,,,,

I can continue this way; however, I do not have any of the other sensors, and there are few samples in rtl_433_tests to check that everything is OK...

merbanan commented 8 years ago

We also need to fix the CSV output; I think we should pass the correct fields table via data_make or data_acquired_handler.

merbanan commented 8 years ago

And it's fine to only change the sensors you have signals for.

merbanan commented 8 years ago

The code you wrote looks good. But we need to move the time stamp out of the decoders.

enavarro222 commented 8 years ago

To fix the CSV, I guess it's not so easy: all the possible columns have to be known before any signal is caught.

merbanan commented 8 years ago

But one knows them when data_make or data_acquired_handler runs in the code. Just add it wherever it suits best, or if you can't be bothered with it, just do it like you do now.
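
Roughly, the decoder side already has the full field list in hand at that point; a minimal sketch (the local variables time_str, rc, channel, battery_low and temp_c are placeholders for whatever the decoder just parsed, not actual code from the tree):

/* Inside the decoder callback, after parsing one message. The key/type pairs
 * passed to data_make() are exactly the columns this sensor can emit, so the
 * CSV writer could pick them up here. Placeholder variables, sketch only. */
data_t *data = data_make(
        "time",          "",             DATA_STRING, time_str,
        "model",         "",             DATA_STRING, "THN132N",
        "rc",            "Rolling code", DATA_INT,    rc,
        "channel",       "Channel",      DATA_INT,    channel,
        "battery",       "Battery",      DATA_STRING, battery_low ? "LOW" : "OK",
        "temperature_C", "Temperature",  DATA_FORMAT, "%.1f C", DATA_DOUBLE, temp_c,
        NULL);
data_acquired_handler(data);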

eras commented 8 years ago

I think one solution would be to allow the user to have 'sub-devices'. It might work like this:

% rtl_433 -R 42.1 -R 42.2

This could work by passing the enabled sub-devices to the device handler's initialization code; the device would remember the enabled sub-devices. Then add a new field to the device structure, called (say) .field_constructor, that returns the list of used columns. In this case the .fields field would be NULL.

A simpler, more practical short-term alternative: allow the user to enter the fields manually. The code is actually written with this option in mind, so that if a manually entered list of fields is provided, the order of the fields will be preserved.
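
A self-contained sketch of that idea (every identifier except .fields is made up here, just to make the shape concrete):

#include <stddef.h>

/* Hypothetical shapes only, to illustrate the proposal above. */
typedef struct device_sketch {
    char **fields;                /* static column list, or NULL */
    char **(*field_constructor)(const struct device_sketch *dev);
                                  /* builds the list when .fields is NULL */
    unsigned enabled_subdevices;  /* bitmask filled from -R 42.1, -R 42.2, ... */
} device_sketch;

/* The CSV writer would ask the device for its columns once, before any
 * signal has been received. */
static char **columns_for(const device_sketch *dev)
{
    if (dev->fields)
        return dev->fields;                 /* fixed list, as today */
    if (dev->field_constructor)
        return dev->field_constructor(dev); /* derived from the enabled sub-devices */
    return NULL;
}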

enavarro222 commented 8 years ago

Sub-devices may be a good idea! It makes sense to have the device/sensor list organized as a two-level tree, especially if this list gets larger... It also makes it possible to limit the output to one kind of sensor (e.g. if you don't want to process a neighbour's "same protocol/brand but different model" sensor).

However it's quite a big change...

Another solution is to adapt the output columns to the incoming messages, in other words to add missing columns as soon as we need them (i.e. when data_make or data_acquired_handler is called). This has some disadvantages:

eras commented 8 years ago

Well, I think that would be plainly impossible to implement and still call it CSV :). Fixed columns are a very defining feature of CSV.

And it would seriously hinder its usefulness as well: how would you ever know the meaning of the columns? Have special rows indicating that new columns have been added, and then incorporate this into your data analysis tools? I'm confident no existing CSV tools or libraries are able to deal with that.

If the user is worried about the number of columns in the CSV (and I'm not sure there's anything to be worried about..), there are only two possible solutions that I can see: let the user choose the columns (or devices and sub-devices), or the user needs to use another format (i.e. JSON).

merbanan commented 8 years ago

We need to think through how CSV should be used. I expect that will take me a few days.

enavarro222 commented 8 years ago

@eras yes, for sure CSV implies fixed columns. I proposed it more to "explore" all possibilities than as a viable solution.

Another possibility is just to remove the CSV format! Then, if one wants CSV output, it's trivial to write a Python script that takes the JSON output and translates it into CSV. Note that this can be done "online" with a pipe if the script knows the expected columns (given with an argparse option). It is also possible offline; the script can then do a first pass to find all the columns...

eras commented 8 years ago

A script doing the same as what rtl_433 is doing is going to be a bit complicated. Just removing the empty columns is a bit easier (slightly golfed for brevity):

% cut -d, -f $(perl -e '
    <>;                                                # skip the header line
    $max=0;
    while (<>) {
      chomp;
      @fs = split(",", $_, -1);
      $max = $#fs if $#fs > $max;                      # widest row seen so far
      $fs[$_] ne "" ? $used{$_}++ : 0 for (0..$#fs);   # count non-empty cells per column
    }
    # emit the 1-based indices of columns that were ever non-empty, for cut -f
    print join(",", grep { defined($_) } map { $_+1 if $used{$_} } 0..$max)
    ' file.csv
  ) file.csv

merbanan commented 8 years ago

I think that the sane use for CSV is when you activate one specific protocol via the -R option. If you enable more, you will get mixed output and will have to filter it out later.

The scenario is that you want to monitor temperature from several sensors of one protocol type but for some reason you get readings from a sensor that only sends humidity. In that case you will have to remove those records later when you process the data.

So IMO the current code is fine. We could maybe implement our own fixed CSV column order, but I think it is much easier if the flow is like the following:

rtl_433 -F csv -R 12 -R 13 >mixed_csv_records.csv
csv_filter.pl -f mixed_csv_records.csv -p oregon_scientific >oregon_scientific_records.csv
csv_filter.pl -f mixed_csv_records.csv -p rubicson >rubicson_records.csv

Now you can load it up in some spreadsheet application and graph your heart out.

Comments?

enavarro222 commented 8 years ago

We may close this issue.

For me it's OK to leave it as it is now, i.e. "CSV output might contain useless columns; it's up to the user to filter them afterwards".

By the way, there is no need for a custom script; tools such as csvkit (with csvcut) do the job:

$ rtl_433 -F csv 2>/dev/null | csvcut -c "model,temperature_C"
model,temperature_C
HIDEKI TS04 sensor,13.100000
HIDEKI TS04 sensor,13.100000
HIDEKI TS04 sensor,13.100000
Thermo Sensor THN132N,18.900000
Thermo Sensor THN132N,18.900000
HIDEKI TS04 sensor,22.000000
HIDEKI TS04 sensor,22.000000

merbanan commented 8 years ago

Excellent, can you add a little documentation about that?

enavarro222 commented 8 years ago

@merbanan yep, would a wiki page "Tips for processing outputs" be fine?