brendan-w / python-OBD

OBD-II serial module for reading engine data
GNU General Public License v2.0
1.05k stars 369 forks source link

0.4.0 with async causes traction control and flat tire warning to trigger randomly #21

Open HawtDogFlvrWtr opened 9 years ago

HawtDogFlvrWtr commented 9 years ago

This didn't happen in 0.2.0, but with 0.3.0+, it triggers a few vehicle warning randomly. After looking at the changes, I think it's the async sleep being changed from 1.0 to 0.25. I believe it's attempting to push multiple commands at once and tripping over each and actually sending false signals to the vehicle that it interprets as an issue.

HawtDogFlvrWtr commented 9 years ago

To explain further, if async is sending 0110 and 0102 at the same time, it may send the command 011002 or something to that effect and trip up the vehicle. I'm going to troubleshoot more later with debug on to see what's up but I did notice yesterday that with debug on and those issues happening, an async call was returning something about "don't understand returned value" or something

brendan-w commented 9 years ago

It's unlikely that the sleep change is causing this. That sleep is only used when no commands are being watched, and it simply prevents the update loop from spinning out of control (while still keeping the thread alive).

While Async doesn't actually push commands simultaneously, it does push them in quick succession (as fast as possible). My guess is that the output buffer is getting dumped too soon, since each query performs a flushOutput(). I'll make a test branch in a minute.

The major change that happened from 0.2.0 to 0.3.0 was enabling the headers on the ELM, so it can distinguish between multiple ECUs. Although, that shouldn't effect that car at all...

brendan-w commented 9 years ago

When you get a chance, checkout the flush branch. It uses a blocking flush, rather than an aborting flush.

HawtDogFlvrWtr commented 9 years ago

yea it was a stab in the dark. I changed it back to 1.0 in my clone and it made no difference. i'll check out the flush branch tomorrow and let you know.

HawtDogFlvrWtr commented 9 years ago

correction, i pulled the flush branch and installed it. i'll TEST it tomorrow when I go for a spin. i'll report back then.

HawtDogFlvrWtr commented 9 years ago

ok so I tested it this morning and it's a hell of a lot better. I only had one warning light, and it was after I turned the car off and back on again after driving it. The initial car on and drive wasn't an issue. Perhaps when i shut the vehicle off, even though i'm stopping async and unwatching all, perhaps the elm still had something partially sent to the console, and when i restarted the async and watching, it sent the original piece and the new watched pieces? I have no idea, but normally while driving my vehicle is going ape, this time, it didn't happen at all when i started the car, I turned the car off after dropping my son off at daycare, and then turned it back on and it immediately dinged once and then worked perfectly there on.

HawtDogFlvrWtr commented 9 years ago

OK, so i drove into work today and a few times it triggered a few things. I noticed that one of the times was when i hit the temperature button as though it triggered it. I know it sounds funny, but it is what it is. Any more ideas? that resolved many of them, but it's still happening a bit.

brendan-w commented 9 years ago

Ok, I made one last simply change that should help. If it still has problems, then we should get a serial sniffing tool to find out what's really going on.

brendan-w commented 9 years ago

oh! I just re-read your code (the stuff you posted in #20), and I remembered that you're making a second serial connection. My code obviously didn't help, but it's possible our two connections are stepping on eachother. Before transmitting, make sure you stop() async, and flush() your serial port between write() and read().


Once I get done implementing #20, you'll be able to do this with custom commands in python-OBD, which avoids all this.

HawtDogFlvrWtr commented 9 years ago

In these cases, I've not actually pushed anything to the vehicle, so that shouldn't apply. I will take that into consideration, though for sure.

HawtDogFlvrWtr commented 9 years ago

OK, I've updated the source the flush branch on my pi. I'll give it a go either tonight or tomorrow morning. Sorry about the delay in testing, I'm a single dad and can't just run out, without a ton of coordination and planning. :)

brendan-w commented 9 years ago

Take your time, no rush!

HawtDogFlvrWtr commented 9 years ago

ok, so i tested it this morning on the way to work (1.5 hour drive) and it did ding once, but i think it was because I was testing my remote start before taking off, to test the flush function being added. BUT, after i dropped my son off at daycare and restarted the car, it didn't ding once on the entire drive, which it normally does about 5 times, and ultimately turns on the traction control light, and can't be turned off without restart the vehicle. so at this point, i think that if i run my code before i start pulling metrics with your code, it causes a ding. I'll work on my end more to see. I'll also test it a little longer without me pushing anything to the vehicle, to see if it happens again.

HawtDogFlvrWtr commented 9 years ago

ok, so i did a little more testing and I may have found an issue with my code that's also causing it to happen. After pulling your metrics, i dump the data to an influxdb server. I had a try, except but in the except, it just looped back into the code if the upload to influx failed. I noticed that when my hotspot lost it's connection, my car went apesh!t. This happened because i never unwatched all the async calls before trying to kick them back off again. It appears that if you try to watch items that are already watched, it makes my car freak out. not sure why this is, but i can replicate it by turning off my hotspot and it immediately starts dinging like mad.

brendan-w commented 9 years ago

I found a bug! and man, it's an embarrassing one...

It turns out, calling start() multiple times (without a corresponding stop()) could result in multiple threads being created, all of which were trying to operate the port.

facedesk

I put the fix on master, along with the flush changes we found before. If it works, I'll probably release an 0.4.1 update.

HawtDogFlvrWtr commented 9 years ago

I can tell that it made a major difference, but I'm still seeing it under certain circumstances. I rewrote all functions related to pulling metrics, to ensure that it wasn't calling start twice or watching more than once. Now i'm only seeing it now and again on vehicle start. It really seems to be a timing thing, as it doesn't happen while i'm driving now. Question, do you ensure that you've sent all AT commands before attempting to pull metric information? The only thing left that I can think, is that it's pushing AT reset commands and attempting to watch things at the same time (because i'm bringing things up quickly and just jumping to it within a millisecond). I say this, because it's only happening right at vehicle start, after my pi restarts. before, it happened at the start of the vehicle and while I was driving.

let me know if that makes sense. I need to enable debugging and watch it via my laptop. Right now i'm doing it via a phone but i can't see the debug information, as it runs on start. I'll try to enable that when i have a second to see where it's dorking out.

brendan-w commented 9 years ago

AT commands are only sent when the connection is first made, and there is a forced wait time while the ELM is initialized, so you shouldn't have to do that.

As for calling things twice, the API is supposed to handle duplicate calls (my apologies for the bug with start()). Duplicate calls allow you to protect bits of code like this:


was_running = async.running
async.stop()

""" your code here """

if was_running:
    async.start()

I'm still suspicious of the second connection that's being made to the port, and the new headers that are being set. Have you tried disabling that, and running only python-OBD code? Just trying to eliminate variables.

HawtDogFlvrWtr commented 9 years ago

yes, I've disabled it on my test vehicle to ensure it wasn't that, which was causing it. Strangely enough, this morning it threw a warning light, but hasn't since. I pulled your latest version and updated and it immediately threw a warning light after starting the vehicle.. but now it hasn't done it since, and I've started and stopped it about 15 times since then. Perhaps there was a partial call in buffer on the elm before the update, so the first start up tried to write additional data to the buffer which made it garbage. I'm not 100% familiar with how it's talking or if that's even possible, for now though, i'm assuming that it was a one time gremlin and that it's working now.

brendan-w commented 9 years ago

Giving you any more grief?

HawtDogFlvrWtr commented 9 years ago

I'm still getting the ding now and again and can't seem to figure out what's causing it. i'm going to switch from async to regular calls for the time being to see if I still get it. It's the only way I can think of, to determine if it's my code . I'm adding each "static" metric to the watch list and merely pulling the information from it every 5 seconds or so, yet it's still dinging shortly after start, sometimes even with my addition obd write function disabled.

def mainLoop(connection, portName, engineStatus):
    dumpObd(connection)
    for metrics in metricsArray:
        if metrics != 'time':
            obdWatch(connection, metrics)  # Watch all metrics

    connection.start()  # Start async calls now that we're watching PID's
    time.sleep(5)  # Wait for first metrics to come in.

    # FIX: NEED TO MAKE THIS HIS 01 TO DETERMINE SUPPORTED PID'S
    while engineStatus is True:
        callBack('1', '1')  # Telling uhacknect the pi is online
        tempValues = []
        timeValue = time.time()
        for metrics in metricsArray:
            if metrics == 'time':  # Grab current time if metric is asking to provide the time.
                value = int(timeValue)
            else:
                value = connection.query(obd.commands[metrics])
                value = value.value
brendan-w commented 9 years ago

I think it's time we find a serial monitor so we can see what's actually going on. If I have time this weekend, I'll see if I can get one up and running.

HawtDogFlvrWtr commented 9 years ago

Any update on this?

brendan-w commented 9 years ago

I started, but ran out of time before I got anywhere worth mentioning. I was looking at using strace to watch the syscalls between python and linux (though, I'm wary of the mess that will be). Learned about it here. Although, there are a few dedicated tools to do serial sniffing, but I haven't played with them yet. Been a little bogged down lately.