alexmohr / open-heat

Eqiva valve firmware for esp8266
GNU General Public License v3.0
4 stars 1 forks source link

The valve gets sort of stuck after some time, getting the commands but it is like frozen (which for a radiator is not good at all! HA HA HA!) #33

Open Anycubic opened 2 years ago

Anycubic commented 2 years ago

After hours running fine in heat mode, in one radiator measured temp was keeping increasing like if the valve was dead. Lowered temp by 2 degrees and the valve not reacting. Shut off command was issued but still no noise from the valve. Recycled the power and I heard the valve closing eventually. Going to serial log ...

alexmohr commented 2 years ago

how long did the valve work after the reset?

Anycubic commented 2 years ago

I'd say 2-3 seconds

Anycubic commented 2 years ago

For sure it wasn't stuck at full open position

Anycubic commented 2 years ago

It was closing-opening a few hundreds millis each time, than started to open and got stuck (the code, not the physical valve)

Anycubic commented 2 years ago

It must be something really tricky because the other one is going great:

2021-11-26_21-52-08

alexmohr commented 2 years ago

It probably drifts appart because there is a (small) time between setting the GPIOs and the motor actually spinning. Maybe always adding a small time to the valve operation, say 10-20ms or so would help.

Anycubic commented 2 years ago

But how does it come that motor didn't spin at all when I issued mode = off? I was there trying to listen with my hear close to the valve :-D

Anycubic commented 2 years ago

BTW my test rig is up and running, tomorrow I will serial log all the day long

alexmohr commented 2 years ago

I won't be able to do things for the next couple of days so take your time logging. The motor does not spin because the software assumes it's fully closed already.

Anycubic commented 2 years ago

I see, if in theory is possible that in a few hours it spinned so many seconds then it could be an explanation, though I can't understand why the other one run all of the day long without a single issue. Hope you are going on holyday!

alexmohr commented 2 years ago

I wish I would :laughing: I don't really have an explanation why to be honest. I doesn't happen to my radiator either.

alexmohr commented 2 years ago

I've added a spin up time of 25ms in the old_regulator branch. Please use this for your tests :)

Anycubic commented 2 years ago

Ok, I'l do it tomorrow ;-)

Anycubic commented 2 years ago

Tonight I was thinking... Let's assume the valve THINKS it's closed but it's not...so why don't you add a safety check like

I'm (the valve) think I'm full closed but after 3/4/whatever regulator cycles temperature is actually INCREASING -> I'm not really closed -> issues a 5/7/whatever seconds closeRotation command -> check again after 3/4/whatever regulator cycles -> temperature decreased eventually? Ok now I'm really full closed, let's resume normal operation from closed position -> temperature is keeping on increasing? -> I'm probably broken (motor or temp sensor) let's issue a full closeRotation command, shut off and send an error message e-mail

What do you think? I need to sleep more? Yes you are right!

Anycubic commented 2 years ago

Wait! It happened also to the other valve (didn't had it patched with last commit yet). Power cycled -> switched on and closed for about 5 seconds

Anycubic commented 2 years ago

And BTW after a cycle of off-heat-off I'd close the valve 5 seconds or so MORE than the regulator calculated, just to take into account jitters/mechanical imperfections. Yes, I know the real solution is always the same, being able to detect the close valve position

Anycubic commented 2 years ago

It probably drifts appart because there is a (small) time between setting the GPIOs and the motor actually spinning. Maybe always adding a small time to the valve operation, say 10-20ms or so would help.

Maybe there is another factor: when you give the start impulse the motor will not start spinning at the moment "0" but some millis after (due to mechanical factors there is a transient period). So it will rotate each time a little bit less that it is supposed to be and so movement by movement will go out of synch eventually.

alexmohr commented 2 years ago

That's exactly what the

It probably drifts appart because there is a (small) time between setting the GPIOs and the motor actually spinning. Maybe always adding a small time to the valve operation, say 10-20ms or so would help.

Maybe there is another factor: when you give the start impulse the motor will not start spinning at the moment "0" but some millis after (due to mechanical factors there is a transient period). So it will rotate each time a little bit less that it is supposed to be and so movement by movement will go out of synch eventually.

That's exactly what the latest commit tries to compensate. 25ms are rotated on top of the given time.

Anycubic commented 2 years ago

I see, I thought that those 25ms more were meant to compensate electric transient only. I have the feeling we need to double if not more that number... will it be too time consuming adding a MQTT topic with total number of movements (close+open) done in an entire heat -> off cycle?

alexmohr commented 2 years ago

50ms would be the maximum as it would affect the regulator too much otherwise. For what do you need the topic?

Anycubic commented 2 years ago

When the valve goes crazy I will approximately calculate how many seconds was out of synch/numbers of activation = average milliseconds * activation of transient to be taken into account. I'll do it until we will ALMOST solve this issue. Will it work?

alexmohr commented 2 years ago

Can't you just take that from the serial log? Should be a simple grep. was the 25ms not enough?

Anycubic commented 2 years ago

Is what I'm trying to do, just saying that should I find the sweet spot now, with that number will be easy to tweak again in the future without having to log one day long 😉

Anycubic commented 2 years ago

was the 25ms not enough?

So far so good, BUT now I can see fluctuation that were not spotted before. I wonder if now with those 25ms the algorithm is really showing itself so that it will be easier to fine tune it once for all

Screenshot_2021-11-27-17-34-34-361_io homeassistant companion android

alexmohr commented 2 years ago

Adjustment of close time should fix this but let's wait if this fixes the closing issue

Anycubic commented 2 years ago

I'm attaching 7 hours of log. At the end it seems that the valve jittered "only" 700ms as per last off operation (actually after power cycling the valve really closed for less than a second). Also the other valve with the new code was almost spot on, same beahaviour.

If I'm not wrong 700milis/43 valve movements = approx 16 millis on top the 25. Btw even if the adjustment value is correct I still think that last off command should close the valve a little more than the theoric value, in a way or another there will be always a little jitter (i.e. motors are not exactly all the same).

I also saw a lot of "Minimal sleep or underflow prevented, sleep set to 10000l ms". Why so many and why sleep set to only 10000 ms? Of course the battery isn't going to be happy.

What do you think?

teraterm.log

alexmohr commented 2 years ago

This won't be ever fixed properly with the current hardware. The jitter varies from valve to valve because the motor spin up time depends on the necessary torque which depends on the valve (and probably also the current motor position as a valve which is closed further requires more torque to close fully) that's probably why both your valves differ and I had never such an issue.

It would be possible to just add 50ms and things might work or less if we adjust the regulator. Rotating an additional second in the last close operation before reaching the limit can be more or less easyily.

The minimal sleep prevention happens of the Mqtt sleep time is a multiple of radiator sleep time. Instead of sleeping only a few milliseconds it just sleeps 10s. This is to make sure we turn the WiFi off before running the next loop. For example if the radiator sleep is 0 at the end of the mqtt loop the device would sleep until an external reset is triggered. This is prevented by the underflow prevention.

I haven't checked the logs yet but I will take a look in the next couple of days.

alexmohr commented 2 years ago
Anycubic commented 2 years ago

Flashed on 4 valves ;-)

alexmohr commented 2 years ago

it's still not solved. Turning off just rotates the valve for the 1s I've added as additional time, but after resetting it's still open a lot. 2-3s after 2 days or so of being open. Maybe the 50ms are not enough. I'll try doubling it but I don't think it will solve this.

I got the hardware to implement a stop detection, but this will take a lot of time to develop and test the circuit and the software for it. At the moment I neither have the time nor the motivation to do this. Maybe in the holiday season but at the moment it looks more like I'm going to drop the project altogether and just buy something that works already.

Anycubic commented 2 years ago

Are you serious? 😭 Which code are you working with? I didn't have those jitter since you implemented the correction factor. And no, it doesn't exist anything like what you are accomplish, everything else is closed source/proprietary/expensive

alexmohr commented 2 years ago

Yes I am serious, it's just taking up too much time with too little results.

Closed source / proprietary isn't that important to me as long as I can have the integration in my network as open source and ELV builds such a product. You are right, they are expensive, but come with some advantages like buttons to shut off and don't take 15 minutes to react to a command, which is kinda important for the wife acceptance factor :sweat_smile: Furthermore their typical battery life is around 2 years which I'll never achieve.

If the stop detection will be implemented depends on how time and motivation I find over the holidays but don't get your hopes up. I have a draft for a circuit but that's it at the moment

Anycubic commented 2 years ago

Ok, I understand. But it's a pity making this very nice work (and lot of efforts) just dying in this way. Could you please just help me in finding what's wrong with your latest commits? I think you broke something with latest 2 commits, temperature keeps increasing. Till bec2057 everything was pretty fine apart failing to completely close the valve.

In particular I have this warning while compiling:

src/heating/RadiatorValve.cpp: In member function 'unsigned int open_heat::heating::RadiatorValve::remainingRotateTime(unsigned int) const':
src/heating/RadiatorValve.cpp:270:40: warning: comparison of integer expressions of different signedness: 'unsigned int' and 'int' [-Wsign-compare]
  270 |   if (remainingTime <= 0 || rotateTime > remainingTime)

Or I don't know if the problem was just the 50 millis or anything else you made in latest commits.

Thanks!

Anycubic commented 2 years ago

And BTW I'm more than happy to keep helping you in developing the last step of close detection, should you change your mind just let me know and I will buy the needed parts ;-)

alexmohr commented 2 years ago

And BTW I'm more than happy to keep helping you in developing the last step of close detection, should you change your mind just let me know and I will buy the needed parts ;-)

Thanks but I have everything I need for it except the time and motivation.

Ok, I understand. But it's a pity making this very nice work (and lot of efforts) just dying in this way. Could you please just help me in finding what's wrong with your latest commits? I think you broke something with latest 2 commits, temperature keeps increasing. Till bec2057 everything was pretty fine apart failing to completely close the valve.

In particular I have this warning while compiling:

src/heating/RadiatorValve.cpp: In member function 'unsigned int open_heat::heating::RadiatorValve::remainingRotateTime(unsigned int) const':
src/heating/RadiatorValve.cpp:270:40: warning: comparison of integer expressions of different signedness: 'unsigned int' and 'int' [-Wsign-compare]
  270 |   if (remainingTime <= 0 || rotateTime > remainingTime)

Or I don't know if the problem was just the 50 millis or anything else you made in latest commits.

Thanks!

The warning isn't too bad but I'll probably just change everything to integers just to prevent underflows when casting to unsigned. I'll update this PR soonish 👍

Anycubic commented 2 years ago

Ok thanks. I think that just fine tuning constant values there are all of the bits to make it work pretty fine ;-)

Anycubic commented 2 years ago

I just flashed one valve with static constexpr int m_spinUpMillis = 35; to check if indeed 50 broke everything

alexmohr commented 2 years ago

I just flashed one valve with static constexpr int m_spinUpMillis = 35; to check if indeed 50 broke everything

Good luck with that. 😅

Anycubic commented 2 years ago

Just leave me with a working good enough code (like pre bec2057) and I don't think I will need any luck, you are understimating your work 🥇

alexmohr commented 2 years ago

Don't worry I'll still try to work out the bugs with this hardware.

Anycubic commented 2 years ago

and don't take 15 minutes to react to a command, which is kinda important for the wife acceptance factor 😅

I have the feeling that this is the real reason.... 😄

alexmohr commented 2 years ago

and don't take 15 minutes to react to a command, which is kinda important for the wife acceptance factor 😅

I have the feeling that this is the real reason.... 😄

Not entirely. I'm a bit annoyed myself that I have to schedule when to open the windows 😅 As you've also requested a button to turn off heating. I guess you can understand

The latest commit on this branch should fix some "stuck" behaviour because the remaining time logic was wrong

Anycubic commented 2 years ago

and don't take 15 minutes to react to a command, which is kinda important for the wife acceptance factor 😅

I have the feeling that this is the real reason.... 😄

Not entirely. I'm a bit annoyed myself that I have to schedule when to open the windows 😅 As you've also requested a button to turn off heating. I guess you can understand

Not for my wife because I'm divorced 😄 but for my 2 agritourism room guests it was an interesting addition. But you see, with a "room thermostat" you don't need that button

Anycubic commented 2 years ago

😅 Furthermore their typical battery life is around 2 years which I'll never achieve.

With my 5000mAh batteries I estimated 40 days with WIFI deepsleep = 5 minutes. That's not too bad, considering I will need maybe just 3 recharge x year (including the first one). Also using that modified shield I could recharge them without moving the valve at all, i.e. using a powerbank to recharge them

alexmohr commented 2 years ago

I need that button also with a room thermostat because the thermostat also speaks Mqtt which means 15m delay so there is no real way around have a switch In my opinion.

Regarding battery life time: you have to take into account that lithium ion batteries have a non linear discharge curve.

Anycubic commented 2 years ago

I need that button also with a room thermostat because the thermostat also speaks Mqtt which means 15m delay so there is no real way around have a switch In my opinion.

In my humble opinion the switch thing is feasible, you push the button (not a switch) -> write "off" in topic and a led will blink (very slowly 😄 ). Push again, no led and "heat" on (or led management vice versa). If you manage to wake up the ESP with the "push" this can be "Interactive"

Regarding battery life time: you have to take into account that lithium ion batteries have a non linear discharge curve.

I know very well, I have a BEV car 😄

alexmohr commented 2 years ago

. If you manage to wake up the ESP with the "push" this can be "Interactive"

If it's not interactive it's still useless as it would take up to 15 minutes until the esp wakes up again with WiFi enabled. And if it should be interactive it would have to send a pulse to the reset pin while holding 3.3v to a digital input. And what do you do after you written "off" to Mqtt? The voltage is still present at the pin and you have to put the switch back to heat again

Anycubic commented 2 years ago

Why a switch? A push button would not serve the purpose? Pushing it changes the state "off" -> "heat" -> "off" etc etc. Of course I don't know how and if this is doable, that's why I wrote "you manage" 😄

alexmohr commented 2 years ago

Because a push button only provides one pulse. This pulse could be used to reset the esp and then it's gone. Holding it would keep the voltage at the reset pin and not esp will not boot. Reset is the only way to get back from deep sleep as interrupts on gpio are disabled too. This means a push button would need a circuit behind it which sends a pulse to reset and holds voltage at a gpio pin.

Anycubic commented 2 years ago

This means a push button would need a circuit behind it which sends a pulse to reset and holds voltage at a gpio pin. You see, you know how to do it 🥇 😄