lafrech / oem_gateway

Part of OpenEnergyMonitor project. Gateway from data source to target database.

Buffer: store data when network is down #6

Open lafrech opened 11 years ago

lafrech commented 11 years ago

Currently, when the buffer can't reach its target, the data is kept in RAM and the communication is retried later. To avoid filling up the RAM, data is discarded once the buffer exceeds an arbitrary hardcoded size.

Possible improvements:

rolfbartels commented 11 years ago

Buffers are working as expected but the time taken to catch up is massive: 3 hours of buffer is taking around 6 hours to catch up.

lafrech commented 11 years ago

This is a bit surprising.

The run() method that sends the data from the buffer is called here, every iteration of the main loop.

The buffer should discharge at a rate of 1 HTTP request every 0.2 seconds, unless the main loop is very busy. Or perhaps it is urllib2.urlopen() that takes too long.

You may find a clue in the logs about what is happening.

rolfbartels commented 11 years ago

Here is an extract from my logs. You'll see the time difference between the sends: it's between 3 and 5 seconds per send. I have also noticed that my feed data is now reporting really weird and incorrect numbers. I have a 2 kW array, so in the best case I could see -1.6 kW grid usage, but I am currently seeing -7 kW. I'm not sure whether this will correct itself once all the feeds catch up?

2013-10-23 13:37:08,655 INFO Sending to emoncms.org
2013-10-23 13:37:11,295 DEBUG Send ok
2013-10-23 13:37:11,520 INFO Serial RX: 20 133 9
2013-10-23 13:37:11,523 DEBUG Node: 20
2013-10-23 13:37:11,525 DEBUG Values: [2437]
2013-10-23 13:37:11,527 DEBUG Server emoncms.org -> send data: [20, 2437]
2013-10-23 13:37:11,529 DEBUG Data string: &node=20&json={1:2437}
2013-10-23 13:37:11,531 DEBUG URL string: http://emoncms.org/input/post.json?apikey=&node=20&json={1:2437}
2013-10-23 13:37:11,533 INFO Sending to emoncms.org
2013-10-23 13:37:13,560 DEBUG Send ok
2013-10-23 13:37:13,562 DEBUG Server emoncms.org -> send again data: [10, 1069, 1222, 0, 23240, 93, 490]
2013-10-23 13:37:13,565 DEBUG Data string: &time=1382512269&node=10&json={1:1069,2:1222,3:0,4:23240,5:93,6:490}
2013-10-23 13:37:13,567 DEBUG URL string: http://emoncms.org/input/post.json?apikey=&time=1382512269&node=10&json={1:1069,2:1222,3:0,4:23240,5:93,6:490}
2013-10-23 13:37:13,569 INFO Sending to emoncms.org
2013-10-23 13:37:16,603 DEBUG Send ok
2013-10-23 13:37:16,836 INFO Serial RX: 10 137 251 102 6 1 0 85 92 158 255 235 1
2013-10-23 13:37:16,839 DEBUG Node: 10
2013-10-23 13:37:16,841 DEBUG Values: [-1143, 1638, 1, 23637, -98, 491]
2013-10-23 13:37:16,843 DEBUG Server emoncms.org -> send data: [10, -1143, 1638, 1, 23637, -98, 491]
2013-10-23 13:37:16,845 DEBUG Data string: &node=10&json={1:-1143,2:1638,3:1,4:23637,5:-98,6:491}
2013-10-23 13:37:16,848 DEBUG URL string: http://emoncms.org/input/post.json?apikey=&node=10&json={1:-1143,2:1638,3:1,4:23637,5:-98,6:491}
2013-10-23 13:37:16,850 INFO Sending to emoncms.org
2013-10-23 13:37:18,773 DEBUG Send ok
2013-10-23 13:37:18,776 DEBUG Server emoncms.org -> send again data: [20, 2250]
2013-10-23 13:37:18,778 DEBUG Data string: &time=1382512270&node=20&json={1:2250}
2013-10-23 13:37:18,780 DEBUG URL string: http://emoncms.org/input/post.json?apikey=&time=1382512270&node=20&json={1:2250}
2013-10-23 13:37:18,782 INFO Sending to emoncms.org
2013-10-23 13:37:20,787 DEBUG Send ok
2013-10-23 13:37:21,026 INFO Serial RX: 10 138 251 99 6 0 0 161 92 158 255 233 1
2013-10-23 13:37:21,030 DEBUG Node: 10
2013-10-23 13:37:21,034 DEBUG Values: [-1142, 1635, 0, 23713, -98, 489]
2013-10-23 13:37:21,039 DEBUG Server emoncms.org -> send data: [10, -1142, 1635, 0, 23713, -98, 489]
2013-10-23 13:37:21,043 DEBUG Data string: &node=10&json={1:-1142,2:1635,3:0,4:23713,5:-98,6:489}
2013-10-23 13:37:21,047 DEBUG URL string: http://emoncms.org/input/post.json?apikey=&node=10&json={1:-1142,2:1635,3:0,4:23713,5:-98,6:489}
2013-10-23 13:37:21,051 INFO Sending to emoncms.org
2013-10-23 13:37:22,316 DEBUG Send ok
2013-10-23 13:37:22,320 DEBUG Server emoncms.org -> send again data: [10, 1072, 1225, 1, 23235, 93, 491]
2013-10-23 13:37:22,324 DEBUG Data string: &time=1382512274&node=10&json={1:1072,2:1225,3:1,4:23235,5:93,6:491}
2013-10-23 13:37:22,328 DEBUG URL string: http://emoncms.org/input/post.json?apikey=&time=1382512274&node=10&json={1:1072,2:1225,3:1,4:23235,5:93,6:491}
2013-10-23 13:37:22,333 INFO Sending to emoncms.org

rolfbartels commented 11 years ago

Yip, something is amiss.

I turned off my solar array production at around 12:30; however, while the buffer was running and still being cleared out, the graphs were reporting that I was still producing power. As soon as I rebooted the Pi, to restart the oem_gateway, the feeds started to report correct values?

http://emoncms.org/vis/auto?feedid=20559

Unfortunately I am not a coder, so I have limited experience in trying to read and fix the code; I would love to have helped. But I sure can test.

lafrech commented 11 years ago

From your logs, it appears that the delay is in the function that sends the data to emoncms. In other words, the limit would be the time emoncms takes to answer.

It takes about 3 seconds to send each set of samples, and you get a new set of samples every 5 seconds. So it takes more than half the time to send the data realtime. Which leaves you less than half the time to send buffered data. Therefore, it takes more than one hour to send one hour of data. That's a bit simplistic, but you get the idea. It is a matter of bandwidth, and the bottleneck here seems to be the communication with emoncms.org.
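The back-of-the-envelope reasoning above can be put into numbers. This is only a rough model, assuming one HTTP request per sample set, a fixed request duration and a single node; the function name and its parameters are illustrative and not part of the gateway.

```python
def catch_up_seconds(backlog_sets, send_time, sample_period):
    """Estimate the time needed to empty a backlog of buffered sample sets.

    Rough model (an assumption, not the gateway's actual scheduling):
    requests are sent back to back, each takes `send_time` seconds, and a
    new sample set arrives every `sample_period` seconds and must be sent too.
    """
    if send_time >= sample_period:
        # Live data alone saturates the link: the backlog never shrinks.
        return float("inf")
    # Net drain rate is 1/send_time - 1/sample_period sets per second,
    # so time = backlog / rate, rearranged to avoid rounding noise.
    return backlog_sets * send_time * sample_period / (sample_period - send_time)


# 3 hours of outage at one sample set every 5 s, 3 s per request.
backlog = 3 * 3600 // 5          # 2160 buffered sample sets
seconds = catch_up_seconds(backlog, send_time=3, sample_period=5)
print(seconds / 3600)            # hours needed to catch up: 4.5
```

Under these assumptions, 3 hours of buffered data needs about 4.5 hours to clear. A second node sharing the same link would stretch that further, which is in the same range as the 6 hours reported earlier in the thread.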

I'm no network guru, and I may have been optimistic when thinking that the data could be sent realtime over the network. A better approach could be to send several data sets in the same HTTP request. Unfortunately, we're limited by the API: the syntax to send several sets at once does not allow feed names. I moved to the named feeds syntax to prepare another feature that does not exist yet and perhaps never will. Perhaps I should step back. Or perhaps the API could be improved to allow sending values for named feeds from different timestamps all at once.

See here: http://openenergymonitor.org/emon/node/2249

(Or perhaps the API already changed and I don't know it...)

lafrech commented 11 years ago

I don't have the time right now to investigate your other question, but I'm thinking that there may be issues with some post-processing operations when sending the data in mixed order. Like when the network goes back up and the new and buffered samples are sent somehow interleaved. I may be wrong. Just thinking out loud.

I could avoid this, and prioritize buffer flushing, which would make the code even simpler, actually...

There would still be a delay, but the processing functions in emoncms should be delay-proof. I hope.

lafrech commented 11 years ago

For the record, related forum thread link.

lafrech commented 11 years ago

Regarding your power production feeds, can you please list the post-processing operations from input to feed (like sum/difference/multiplication/average/...) ? We may want to check whether these would be screwed up by either delay or "interleaving".

rolfbartels commented 11 years ago

I am taking readings at my supply from the utility.

log; kwh; kwhd; +input"Solar"; log; kwh;kwhd

First Log is the grid, then I add the solar production to it to get the actual usage. for the solar it's log kwh kwhd

lafrech commented 11 years ago

The power to kwh function calculates the energy by multiplying the power by the time elapsed. The time elapsed is the time of the sample minus the time of the last sample. Without explicit sorting, I don't know which is the last. Probably the last received, not necessarily the one with the most recent timestamp. Anyway, this logic is broken if samples are sent not in chronological order.

Current implementation is broken by design. We need to flush the buffer before sending new samples. I think it is an easy fix.

This does not really explain why your graph would show production after the production actually stopped, though, as 0 multiplied by any amount of time is still 0.
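The power-to-kWh logic described in this comment can be sketched as follows. It assumes the elapsed time is measured against the previously received sample, which is the guess made above and has not been checked against the emoncms source; the function name and the sample values are made up for illustration.

```python
def integrate_energy(samples):
    """Sum power * elapsed time over samples taken in arrival order.

    `samples` is a list of (timestamp_s, power_w) pairs. The elapsed time is
    measured against the previously *received* sample (an assumption).
    Returns watt-seconds; divide by 3.6e6 to get kWh.
    """
    total_ws = 0
    last_time = None
    for timestamp, power in samples:
        if last_time is not None:
            total_ws += power * (timestamp - last_time)
        last_time = timestamp
    return total_ws


in_order = [(0, 1000), (5, 2000), (10, 1000), (15, 2000)]
interleaved = [(0, 1000), (10, 1000), (5, 2000), (15, 2000)]

print(integrate_energy(in_order))             # 25000
print(integrate_energy(interleaved))          # 20000
print(integrate_energy(sorted(interleaved)))  # 25000
```

The same four samples give a different total when two of them arrive swapped, because one elapsed time becomes negative and the next one too long. Sorting before integrating restores the expected value.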

rolfbartels commented 11 years ago

Cool,

Looking at my dashboard and stats, I can't see why I actually need kwh, unless I want to track the total over time ? I could just remove it and see again, however this would not solve the code and could potentially cause issues for others ?

lafrech commented 11 years ago

I just committed a fix to make sure the data is sent in chronological order: https://github.com/Jerome-github/oem_gateway/commit/06f07896c0f8f821cdb77ee01f8a2d7306c68909

Would you mind checking whether you still have the same kind of trouble ?
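For readers following along, sending in chronological order can be sketched like this. It is not the code from the commit linked above: it simply assumes that every new sample set joins the tail of a queue and that sending always starts from the head. The `flush` name and the `send` callable are hypothetical.

```python
from collections import deque


def flush(buffer, send):
    """Send buffered sample sets oldest first and stop at the first failure.

    `buffer` is a deque of sample sets; `send` is any callable returning True
    on success. Both are placeholders for illustration.
    """
    sent = 0
    while buffer:
        if not send(buffer[0]):
            break                # keep the failed set at the head for next time
        buffer.popleft()
        sent += 1
    return sent


buffer = deque()
buffer.append((1382512269, 10, [1069, 1222]))   # old, buffered set
buffer.append((1382512274, 10, [1072, 1225]))   # newer set joins the tail
flush(buffer, send=lambda sample_set: True)     # sends both, oldest first
```

Because a failed set stays at the head, a newer sample can never overtake an older one.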

rolfbartels commented 11 years ago

Cool, will test now. I see that your oemgateway is missing the .py extension, which will cause the script not to load from the rc.local file at reboot.

lafrech commented 11 years ago

Thanks.

Yes, the filename modification is part of the changes in recent commits. I modified oemgateway.init.dist (the model) accordingly, but you're right, this change is not automatically made into the init file. I don't really have a solution for that, except try to avoid it when possible.

rolfbartels commented 11 years ago

Cool, I have fixed mine to use the init script.

Initial testing over 20 minutes shows that it seems to be reporting the correct values now, which is good. I'll do a longer test overnight and report once complete. I still can't see why the gateway is only sending values every 2-3 sec? Can you perhaps check on your side with the logs?

lafrech commented 11 years ago

I think it takes 2-3 secs because the call of urllib2.urlopen() lasts 2-3 secs. This can be due to emoncms.org taking too much time to answer.

See

2013-10-23 13:37:13,569 INFO Sending to emoncms.org
2013-10-23 13:37:16,603 DEBUG Send ok

I'm not sure I can do much about it.

Maybe the only reasonable answer is to group http requests: send several sample sets in one call. It's what I tried to explain here but I was probably not clear. Anyway, I'll be thinking about it and I'll try to come up with a solution, as I deem this a blocking issue.

rolfbartels commented 11 years ago

Thanks, I understand now. I have been testing with 13 hours of network issues, and since Trystan has now fixed the emoncms server I have started uploading again. The values are looking good and the updates are pushing through. I'll let you know the final results once it's all complete, but good work and thanks.

With regards to the 2-3sec delivery time could this not maybe be an issue with the emoncms server performance, maybe when Trystan moves it to a new dedicated server it might improve ?

rolfbartels commented 11 years ago

An idea could be to only save to the buffer at 15 s intervals; having some data is better than no data. You could also make this configurable in oemgateway.conf, so that a sample is only saved to the buffer if its timestamp is more than the defined value after the last sent timestamp. Currently it looks like I am trying to upload 5 s interval data, one send every 2-3 s, while still receiving new RX from the emonTX. This could be implemented as a temporary fix until you have time to work out a better solution.
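The thinning idea above could look something like the sketch below. Everything in it is hypothetical: the `thin` function, the `min_interval` setting (which the suggestion would read from oemgateway.conf) and the choice to thin each node separately are assumptions made for the example.

```python
def thin(samples, min_interval=15):
    """Keep only samples at least `min_interval` seconds after the last kept one.

    `samples` is an iterable of (timestamp_s, node, values) tuples in
    chronological order; each node is thinned independently.
    """
    last_kept = {}
    kept = []
    for timestamp, node, values in samples:
        previous = last_kept.get(node)
        if previous is None or timestamp - previous >= min_interval:
            kept.append((timestamp, node, values))
            last_kept[node] = timestamp
    return kept


# One sample every 5 s for node 10, thinned to one every 15 s.
samples = [(t, 10, [1000 + t]) for t in range(0, 35, 5)]
print([t for t, _, _ in thin(samples)])   # [0, 15, 30]
```

This trades resolution for a smaller backlog, so any energy calculation on the server side would see longer intervals between points.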

lafrech commented 11 years ago

I just pushed a few commits you might want to check out.

I added back a period between data posts. You'll need to add the "period" parameter to your config file.

Every 30 seconds (for instance), the data sets are sent all at once to the server. This way, a 5-second delay is not much of an issue.

I limited the number of data sets to be sent at once to 100, so that the URL is not one kilometer long after a long network downtime.
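As an illustration of the batching described above, the sketch below builds the data string for one bulk request from a list of buffered sets. It follows the shape of the bulk strings that appear in the logs of this thread (time offset, node, values), but the function name, the tuple layout and the rounding are assumptions and not the gateway's actual code.

```python
import json

MAX_SETS_PER_REQUEST = 100  # the limit mentioned above


def build_bulk_data(buffer, now, max_sets=MAX_SETS_PER_REQUEST):
    """Return (data_string, remaining_buffer) for one bulk request.

    `buffer` is a list of (timestamp_s, node, values) tuples, oldest first.
    Times are written as offsets from `now`, so they are zero or negative.
    """
    batch, remaining = buffer[:max_sets], buffer[max_sets:]
    rows = [[round(ts - now, 2), node] + list(values) for ts, node, values in batch]
    return json.dumps(rows, separators=(",", ":")), remaining


buffer = [(990.0, 10, [1806]), (995.0, 10, [1806]), (1000.0, 10, [1806])]
data, buffer = build_bulk_data(buffer, now=1000.0)
print(data)   # [[-10.0,10,1806],[-5.0,10,1806],[0.0,10,1806]]
```

Anything beyond the first 100 sets stays in the returned buffer and goes out with the next request.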

Thank you for your feedback.

Glad your power calculation was fixed by former modifications.

Perhaps the answer delay could be improved on server side, but the gateway needs to be able to work on any network conditions anyway.

rolfbartels commented 11 years ago

Thanks Jerome. Initial tests seem to work well. I'll now simulate an outage of a few hours and report back. I can't see how you are handling the timestamps, but if the data is correct then that's all good. Shot!

rolfbartels commented 11 years ago

Hi Jerome, I think there is a problem in the code. I disabled the network at 13:25 and started it again at 14:25. I can see the data being uploaded, but not in the time slot between 13:25 and 14:25: it has all been added from 14:25 onwards, making a very dense zigzag on the charts. I also noticed that the minute I re-enabled the network, data was being added at that point in time and not starting from when it went down. I also noticed that, if I have it correctly, the time offset (the first value) does not seem to be sequential between sends: one send will have -700 to -400 and the next send will have the same range. I would have expected -700 to -601, then -600 to -501, etc.

Shot !

rolfbartels commented 11 years ago

Hey Jerome,

I have found a new bug. I rolled back to the previous version, from before the bulk send, and I believe the issue will occur in both releases. I unplugged my internet connection due to a thunderstorm, and when I reconnected it I found that the oem_gateway did not resume sending values. In the logs it was held up at "INFO Sending to emoncms.org". I think it was waiting for a response from the server: I must have unplugged before it received one, and thus it never went any further. I did a bit of reading and I think you need to add a timeout to urllib2.urlopen, something like this: result = urllib2.urlopen(url_string, timeout=360). I added 360; it could be less for the version I'm running, and for the bulk send it would probably need to be more. You would probably need to look at the error handling of this as well.

Shot

lafrech commented 11 years ago

Hi.

I didn't have much time to investigate the network failure behavior. I shall try to reproduce. Perhaps do you have the relevant part of the log ?

Regarding the timeout, you might be interested in issue https://github.com/Jerome-github/oem_gateway/issues/1. Basically, there should be a default timeout anyway, and all of this is managed outside of my code, in libraries. Perhaps adding a timeout in urllib2.urlopen() would solve this, though. Wouldn't hurt anyway. Could be checked if the issue is reproducible.

Looks like the timeout in the library does not manage all kinds of issues. We could wrap urllib2.urlopen() in a function with another timeout we could rely upon. I wanted to do this with a very short time (like 2 seconds) to prevent urlopen() from stopping the whole execution for 10 seconds, but from your experience, emoncms.org, for instance, can be slow to reply, so we may not want to do that. However, if we need to do this to avoid the program being stuck as you experienced, then why not add a long timeout, like a minute or so...
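A minimal sketch of such a wrapper is shown below. It uses Python 3's urllib.request rather than the urllib2 module discussed in this thread; the explicit `timeout` argument exists in both. The `send` name, the 60-second default and the injectable `opener` argument are assumptions made so the example can be exercised without a network.

```python
import http.client
import urllib.request


def send(url, timeout=60, opener=urllib.request.urlopen):
    """Return True if the server answered, False on timeout or network error.

    `opener` is injectable only so the function can be exercised offline.
    """
    try:
        with opener(url, timeout=timeout) as response:
            response.read()
    except (OSError, http.client.HTTPException):
        # URLError, socket.timeout and connection resets all derive from OSError.
        return False
    return True
```

Returning False instead of raising lets the caller keep the sample set in the buffer and retry on the next pass, rather than blocking forever on a dead connection.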

rolfbartels commented 11 years ago

Hi

Yes, the global timeout does not seem to work. I added the timeout to the urlopen call and this worked for me. 360 is excessive; 15 sec would work.

With regards to the bulk data send, the timeout would need to be adjusted accordingly. And just to reiterate: the last lot of patches you did to send data as a bulk update is not sending the data FIFO, and is not sending the data with the timestamp.

Thanks Again !

lafrech commented 11 years ago

OK I added a 60 s timeout. This is for the specific case you describe and this does not really address issue https://github.com/Jerome-github/oem_gateway/issues/1.

I don't really understand your second paragraph. I just tested and it seemed FIFO to me. And it does send a timestamp (not a timestamp actually, but rather (timestamp - now), which is a negative value meaning "a few seconds in the past").

When you get the time, would you mind providing me an excerpt of the log showing what goes wrong ?

Thanks.

rolfbartels commented 11 years ago

Hi, I have just tested again and something is still not right. I edited the /etc/hosts file and added a fake entry for emoncms.org at 13:12 to simulate an issue posting to the site, left it like that for 10 minutes and then removed the entry. From the screenshots you can see where it stopped and started, and you can see that it never inserted the missing data.

1 network down 13h10 2 network start

I'll post the log file in the next comment

rolfbartels commented 11 years ago

2013-10-28 13:24:24,448 DEBUG Data string: [[-756.59,10,580,1285,0,23535,92,265],[-756.38,20,2375],[-754.11,10,595,1278,3,23551,93,270],[-748.34,10,607,1277,0,23598,93,274],[-746.49,20,2375],[-742.58,10,546,1324,0,23521,92,251],[-736.86,10,463,1376,1,23594,89,220],[-736.64,20,2375],[-731.08,10,545,1334,5,23562,91,251],[-726.56,20,2375],[-725.32,10,594,1287,0,23581,93,269],[-719.61,10,554,1300,3,23581,92,253],[-716.52,20,2375],[-714.03,10,306,1545,0,23600,78,165],[-708.28,10,185,1501,0,23670,66,118],[-706.61,20,2375],[-702.33,10,502,1379,0,23589,91,234],[-696.78,10,605,1272,0,23571,93,273],[-696.57,20,2375],[-691.0,10,634,1193,1,23587,94,283],[-684.27,20,2375],[-684.04,10,681,1198,0,23572,94,304],[-679.51,10,701,1185,0,23585,95,312],[-676.62,20,2375],[-673.73,10,734,1146,1,23548,95,325],[-666.16,10,723,1166,0,23566,95,321],[-665.94,20,2375],[-662.25,10,691,1158,2,23578,95,307],[-656.48,10,746,1142,2,23573,95,330],[-656.27,20,2375],[-648.08,10,747,1141,3,23593,95,330],[-646.63,20,2375],[-645.19,10,746,1127,0,23556,95,330],[-639.42,10,781,1088,0,23537,96,345],[-636.53,20,2375],[-633.64,10,947,1048,0,23512,94,427],[-627.93,10,968,1032,1,23504,94,435],[-626.49,20,2375],[-622.17,10,947,1032,1,23513,94,426],[-616.61,20,2375],[-616.39,10,928,1056,0,23547,93,419],[-610.68,10,894,1096,2,23617,93,405],[-606.56,20,2375],[-604.91,10,886,1118,1,23614,93,402],[-599.13,10,868,1120,1,23617,93,395],[-593.82,20,2375],[-593.39,10,901,1081,1,23603,93,408],[-587.63,10,896,1087,1,23600,93,406],[-586.59,20,2368],[-582.07,10,954,1026,1,23609,94,428]]
2013-10-28 13:24:24,453 DEBUG URL string: http://emoncms.org/input/bulk.json?apikey=xxxxxxxxxxxxxxxxx&data=[[-756.59,10,580,1285,0,23535,92,265],[-756.38,20,2375],[-754.11,10,595,1278,3,23551,93,270],[-748.34,10,607,1277,0,23598,93,274],[-746.49,20,2375],[-742.58,10,546,1324,0,23521,92,251],[-736.86,10,463,1376,1,23594,89,220],[-736.64,20,2375],[-731.08,10,545,1334,5,23562,91,251],[-726.56,20,2375],[-725.32,10,594,1287,0,23581,93,269],[-719.61,10,554,1300,3,23581,92,253],[-716.52,20,2375],[-714.03,10,306,1545,0,23600,78,165],[-708.28,10,185,1501,0,23670,66,118],[-706.61,20,2375],[-702.33,10,502,1379,0,23589,91,234],[-696.78,10,605,1272,0,23571,93,273],[-696.57,20,2375],[-691.0,10,634,1193,1,23587,94,283],[-684.27,20,2375],[-684.04,10,681,1198,0,23572,94,304],[-679.51,10,701,1185,0,23585,95,312],[-676.62,20,2375],[-673.73,10,734,1146,1,23548,95,325],[-666.16,10,723,1166,0,23566,95,321],[-665.94,20,2375],[-662.25,10,691,1158,2,23578,95,307],[-656.48,10,746,1142,2,23573,95,330],[-656.27,20,2375],[-648.08,10,747,1141,3,23593,95,330],[-646.63,20,2375],[-645.19,10,746,1127,0,23556,95,330],[-639.42,10,781,1088,0,23537,96,345],[-636.53,20,2375],[-633.64,10,947,1048,0,23512,94,427],[-627.93,10,968,1032,1,23504,94,435],[-626.49,20,2375],[-622.17,10,947,1032,1,23513,94,426],[-616.61,20,2375],[-616.39,10,928,1056,0,23547,93,419],[-610.68,10,894,1096,2,23617,93,405],[-606.56,20,2375],[-604.91,10,886,1118,1,23614,93,402],[-599.13,10,868,1120,1,23617,93,395],[-593.82,20,2375],[-593.39,10,901,1081,1,23603,93,408],[-587.63,10,896,1087,1,23600,93,406],[-586.59,20,2368],[-582.07,10,954,1026,1,23609,94,428]]
2013-10-28 13:24:24,457 INFO Sending to emoncms.org
2013-10-28 13:25:01,312 DEBUG Send ok

lafrech commented 11 years ago

Thanks for the log. (I edited to remove your apikey.)

It looks fine to me. The data is sent FIFO, with times from -750 to -580, which is about 10 minutes back in time.

I'm wondering. Is the bulk method still valid, could this be related to timestore ? I haven't switched to timestore yet.

Can you please try something like

http://emoncms.org/input/bulk.json?apikey=xxxxxxxxxxxxxxxxx&data=[[-10,69,12],[-5,69,79]]

and see what happens ?

This should create a node 69 (change if you already have one) and add dummy values 12 and 79 a few seconds before. I think you get the idea. I'd like to see if this syntax is still working. Play with this and see if you get what you'd expect.

rolfbartels commented 11 years ago

Hey, it does create the input, and if I log it to a feed, I can change the first value of each series and the feed reports "updated 3s ago" or so. I would expect that if I changed the -10 to, say, -600, I should see it say "updated 10 mins ago"; this is what we were seeing before we were doing bulk updates. It looks like no matter what offset we set, it is taken as submitted now.

rolfbartels commented 11 years ago

I have posted on the forums to see if someone else knows what's up http://openenergymonitor.org/emon/node/3027

TrystanLea commented 11 years ago

emoncms.org is having load problems at the moment, which may make debugging this particularly difficult. But what do you think of the idea of changing the bulk packet format to actually include the real timestamp as it might simplify bulk data generation and parsing? It would mean the raspberrypi and server would need to be on time.

rolfbartels commented 11 years ago

This could be an issue as for example I am GMT+2, this could actually be my problem, as we are testing and sending a data set with an offset of 10mins, but currently I am 1 or 2 hours ahead of you ?

lafrech commented 11 years ago

For anyone catching up who wouldn't want to read previous posts, the OEM Gateway was using this syntax:

http://domain.tld/emoncms/input/post.json?apikey=12345&time=132165465465&node=10&json={1:1806,2:1664}

Now, because the server response can be long compared to the sampling period (like 3 seconds when sending a sample every 5 seconds), I want to re-introduce the bulk mode used in the Python RFM2Pi Gateway, which would look like

http://domain.tld/emoncms/input/bulk.json?apikey=12345&data=[[-10,10,1806],[-5,10,1806],[0,10,1806]]

which is how the RFM2Pi proceeds (so I guess it is broken as well unless used with period 0 seconds !).

And by the way, when switching to bulk mode, we lose the input name information, which is not used right now but could be. It could be useful in some cases to have a syntax with the features of both: named inputs and multiple times.

"But what do you think of the idea of changing the bulk packet format to actually include the real timestamp as it might simplify bulk data generation and parsing"

Fine with me. It would be a bit more simple, but the code is not complex anyway, so I don't mind.

"It would mean the raspberrypi and server would need to be on time."

Good point. Didn't think about that. Could this be an issue for other gateways (Nanode) ?

It is good practice to use a ntp server, so assuming the gateway is correctly synchronized is realistic (especially if configured on ready-to-go SD cards). As long as this does not break the other possible gateways, we can go for this.

Note: to account for the sync difference, we could add a &now=21321321 parameter to let the gateway inform the server of its reference. Or is this too ugly ?

"This could be an issue as for example I am GMT+2, this could actually be my problem, as we are testing and sending a data set with an offset of 10mins, but currently I am 1 or 2 hours ahead of you ?"

AFAIK, this shouldn't be an issue. timestamps are universal time, local-time-proof. Each machine converts universal time to local time based on timezone. Basically. The timestamp we use, like time.time() in python, is the number of seconds since 1st of January 1970 GMT+0 (http://en.wikipedia.org/wiki/Epoch_%28reference_date%29#Computing).
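A quick way to convince oneself of this is to convert the same epoch value into two timezones and compare. The sketch below uses one of the `time=` values from the logs earlier in the thread; GMT+2 is modelled as a fixed offset, which is an assumption made for the example.

```python
from datetime import datetime, timedelta, timezone

stamp = 1382512269                      # a `time=` value from the logs above
utc = datetime.fromtimestamp(stamp, tz=timezone.utc)
gmt_plus_2 = datetime.fromtimestamp(stamp, tz=timezone(timedelta(hours=2)))

print(utc.isoformat())
print(gmt_plus_2.isoformat())
print(utc == gmt_plus_2)                # True: same instant, different wall-clock time
```

Only the wall-clock rendering differs by two hours; the underlying number of seconds since the epoch is identical, so the sender's timezone should not shift the data.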

TrystanLea commented 11 years ago

Ok, emoncms.org has been sorted out: the server has been upgraded, and the response times this morning have been around 100ms instead of 3-4 seconds :) Next I will try and familiarise myself a little with Python so I can help.

lafrech commented 11 years ago

Great.

Python is easy, although sometimes tricky, even cryptic, but I don't use complicated python specific syntax because I'm not familiar enough and because I don't want to obfuscate my code. Please ask if anything is unclear.

My proposal here is an emoncms API modification to allow for time-shifted bulk sending, or any other mechanism you'd find equivalent or better.

TrystanLea commented 10 years ago

Hello Jerome, sorry for the slow reply. I've had a bit of fun coding in Python and tried creating a mini web app with webpy, but haven't got any closer on the questions at hand above.

I'm still not sure about this idea of timestamps vs the current implementation using an incrementing time index, and think maybe for now we keep it as it is. There are a couple of things that need to be sorted on the emoncms end to deal better with bulk uploads. For example, say emoncms.org goes down for a couple of hours and then, once it comes back up, hundreds of Raspberry Pis try to simultaneously send all their buffered data. We've been discussing here the idea of using a redis queue on emoncms that all these Pis could dump their buffered data into, which could then be worked through slowly by a worker process.

I'm wondering if, for now, until emoncms.org has a better system implemented for dealing with simultaneous bulk uploads, buffering could be disabled by default. For people posting to their own servers, though, maybe it's worth having the option to enable it.

rolfbartels commented 10 years ago

I can see that should the emoncms go down this could cause 100's of Pi's flooding the server and that needs to be taken into account. Currently the buffer is trying to keep all the data, maybe it should only keep every 5,10 or 15 sec interval information, this should reduce the amount of data that has to be flushed to emoncms. As I mentioned before maybe a proxy service that the Pi's write to and this then flushes to the emoncms DB, or something.

Good work though guys.!!!!

lafrech commented 10 years ago

Hi guys.

I understand your point, Trystan.

I think that bulk sending may decrease the CPU load as compared to sending one value at a time, so it is still an interesting feature.

I guess I should

lafrech commented 10 years ago

I just reverted latest commits about bulk mode (as well as another commit about flushing the buffer when active is set to false).

Next, I should add a parameter to enable/disable the buffering.

We can add a note explaining the users that they may use the feature

If bulk mode gets to be supported, it may be easier to manage a lot of data at a time.

lafrech commented 10 years ago

I think I found what is "wrong" with bulk mode.

input/bulk.json?data=[[0,16,1137],[2,17,1437,3164],[4,19,1412,3077]]

In this string, emoncms assumes that the last data ([4,19,1412,3077]) corresponds to the time the json is sent (ie: now).

I suppose this is handy for the nanodeRF, because it can build its string one sample after another.

I think Trystan does the same in the raspberry_run.php.

I used it another way in the Pi:

input/bulk.json?data=[[-4,16,1137],[-2,17,1437,3164],[0,19,1412,3077]]

This works fine just as well. And it worked as long as the last sample of the string was "now".

Problems come when we send old data from buffer in multiple sends:

input/bulk.json?data=[[-54,16,1137],[-52,17,1437,3164],[-50,19,1412,3077]]

Here, emoncms will assume that -50 is now.

How could we improve this without breaking nanodeRF and raspberry_run.php ?

Create another bulk mode and keep the old one for compatibility ? (duplicates code, exposes complexity)

Add a time delay parameter ? (easy and safe, my favourite)

Deal differently with negative time values ? (dodgy)

Use "real-time" timestamps ? Breaks nanode as well. This would mean the emitter needs to be synchronized, or to send its absolute time like &now=1315487564.

No time for this right now but I wanted to write this down.

Currently, the oemgateway uses json and absolute time, which raises synchronization issues (https://github.com/Jerome-github/oem_gateway/issues/12). We could add a &now parameter to the json string (easy and safe), but we still wouldn't be able to send multiple data sets in a request. That means no buffering capability.

Or we could use bulk mode with a time delay parameter. Then we'd lose json's capability of naming inputs (nobody uses it with oemgateway currently).

(Losing input naming is not that bad. I thought it was nice as it allowed to send only a subset of data for a node. It is useful if you have a set of sensors on the Pi, for instance, and don. But for this, you might as well declare a node per sensor.)

My proposal: add &delay to bulk mode to authorize sending old stuff.

Then we could use bulk mode in the gateway. This is not only useful for long downtime. When the urlopen() function fails, it can take 10 seconds. We want to be able to retry without artificially shifting the sample 10 seconds and screwing the power calculation.
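The behaviour described in this comment, and the effect of the proposed parameter, can be modelled in a few lines. This is a toy model of what is described above, not the emoncms implementation; `resolve_times` and its `delay` argument are hypothetical names, and the exact semantics of the real parameter may differ.

```python
def resolve_times(rows, now, delay=0):
    """Map bulk time offsets to absolute timestamps.

    Models the behaviour described above: the last row is taken to be `now`
    and earlier rows are placed relative to it. `delay` stands for the
    proposed parameter: how many seconds before `now` the last row was
    actually sampled.
    """
    last_offset = rows[-1][0]
    return [now - delay - (last_offset - row[0]) for row in rows]


rows = [[-54, 16, 1137], [-52, 17, 1437, 3164], [-50, 19, 1412, 3077]]
print(resolve_times(rows, now=1000))            # [996, 998, 1000]
print(resolve_times(rows, now=1000, delay=50))  # [946, 948, 950]
```

Without the extra parameter the whole batch is shifted 50 seconds towards the present; with it, each row lands where the gateway intended, and strings whose last entry really is "now" keep working with the default of zero.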

I suppose we could disable buffering by default, or set a small buffer size, to avoid server congestion.

Trystan, any thoughts ?

lafrech commented 10 years ago

BTW, maybe we could clarify the API help to prevent people from using the bulk mode incorrectly like I did.

If you agree with the new delay parameter, I can try to send a pull-request when I get the time.

lafrech commented 10 years ago

Here's the pull request for the offset parameter: https://github.com/emoncms/emoncms/pull/118