b-rad-NDi / Ubuntu-media-tree-kernel-builder

Slip stream the latest LinuxTV.org media drivers into an installable Ubuntu kernel package
94 stars 9 forks source link

QuadHD USB not stable on Debian #99

Open brad112358 opened 4 years ago

brad112358 commented 4 years ago

Greetings. I recently purchased one of these to add some tuners to my MythTV setup and to test it for a couple of friends who are interested in MythTV as I want them to have a good experience.

I applied patches/mainline-extra/tip/90.usb.quadhd.xml692/* to my stock Debian kernel source (4.19.98 based). It built fine, and seemed to work at first, but as soon as the MythTV backend closed the device, it stopped working. I have been able to get it to work longer by commenting out the sleep hook in the mxl692_ops structure. It seemed that once it was put to sleep, it never worked again. I'm still seeing lots of errors in the kernel logs. A partial log is attached.

Have I missed another patch I need to get this working correctly? At one point, I tried patching with the full set of Ubuntu-bionic-4.15.0 patches, but even after fixing most of the obvious patch failures, the resulting kernel seemed to have the same issue.

Most of the errors seem to be related to i2c read/write errors.

Thanks! --Brad dmesg.log

b-rad-NDi commented 4 years ago

I just pushed up a change to address that, for now. I also added locking to one function that seemed appropriate.

If you could regenerate the logs, with only debug in drivers/media/ enabled with +pfl that'd be great.

b-rad-NDi commented 4 years ago

Also pushed a fix that was probably the cause of the "positive" errno being percolated back to the usb bridge. Would be curious to see though to know the meaning of status 10,14,18. In your case 18 was critical.

brad112358 commented 4 years ago

I'm still seeing tuning failures. I've uploaded a new log with file "drivers/media/ +pfl" and "drivers/media//* +pfl" I hope that catches all the files you wanted. Thanks!

brad112358 commented 4 years ago

FYI, the log I sent today didn't include your errno fix or the print rx header status error code change. as I pulled before they were pushed. Let me know if you need me to update and gather more logs tomorrow. Thanks.

b-rad-NDi commented 4 years ago

not suprised you still see the errors, i'm just trying to get the instrumentation top notch. It would be ideal if we could isolate your failures to very specific spots.

b-rad-NDi commented 4 years ago

It's crazy to me that mythtv polls signal statistics at times every 10-50ms. There is absolutely no reason to be constantly banging on a device like that. That is ridiculous.

Can you describe your failure use case? Are all four tuners active and recording? Are they all being used for a "distributed" epg collection?

I'll have to instrument further to catch the failing op code.

b-rad-NDi commented 4 years ago

These lines are noisy and should be turned off:

dvb_frontend_do_ioctl:1995 dvb_frontend_handle_ioctl:2398 dvb_frontend_get_event:287 dev_dbg_obj:102 dev_dbg_obj:114 dev_dbg_obj:126

I'm going to further instrument the driver after analyzing your log further, so don't regenerate yet.

Could you confirm you included commit 2f5f59e? I was expecting to have fixed the "unknown error" message the em28xx bridge is printing by ensuring all error codes are proper negative return codes...but something is still sending an invalid return to the bridge.

brad112358 commented 4 years ago

No, that commit was not included which I mentioned in my last comment.

b-rad-NDi commented 4 years ago

ah i misread.

b-rad-NDi commented 4 years ago

ok, commit f0884168 contains new debug output in case of various failures, shows the opcode which failed, and returns error codes on the v3 stat functions if they fail. This should be fairly thorough.

brad112358 commented 4 years ago

As far as use case, sometimes all 4 tuners are recording at the same time with my normal list of recorded programs. I suspect they are frequently used for epg collection, but I've never really investigated the pattern of epg use. I think I've seen failures with less than all 4 tuners recording, but to reproduce it, I've been trying to stress it when I have the extra debug turned on, so I usually make sure 3 are recording, then switch in and out of live TV on the 4th tuner. Sometimes it takes a while to cause the first tuning failure.

It sounds like you are ready for another log so I'll pull and rebuild now.

b-rad-NDi commented 4 years ago

Your v5 log appears like nothing is working whatsoever. I'll have to give it a test on my system. I didn't think anything I did should affect it so adversely.

btw, could you also disable the lines I requested a few posts back via dyndebug?

brad112358 commented 4 years ago

kernel_5.log.gz includes up to commit 8172735283d436359d22ce519e7dbb754f8f0947 with the following dynamic debug selections in effect:: echo "file drivers/media/ -pfl" > /sys/kernel/debug/dynamic_debug/control; echo "file drivers/media// -pfl" > /sys/kernel/debug/dynamic_debug/control; echo "file drivers/media/// -pfl" > /sys/kernel/debug/dynamic_debug/control Doesn't seem like it included much output after I started MythTV

brad112358 commented 4 years ago

This test did record 4 programs from 3 tuners (I think two of the programs were on the same frequency) I'm not sure why tere wasn't much output. I think the first thing it tried to record failed though.

b-rad-NDi commented 4 years ago

-pfl means disable

I'm not sure of the meaning of the second and third lines...is something missing?

You should be using something similar to this:

echo "file drivers/media/* +pfl" > /sys/kernel/debug/dynamic_debug/control
echo "func dvb_frontend_do_ioctl -p" > /sys/kernel/debug/dynamic_debug/control
echo "func dvb_frontend_handle_ioctl -p" > /sys/kernel/debug/dynamic_debug/control
echo "func dvb_frontend_get_event -p" > /sys/kernel/debug/dynamic_debug/control
echo "func dev_dbg_obj -p" > /sys/kernel/debug/dynamic_debug/control
brad112358 commented 4 years ago

Opps. I cut and past the line I turned it back off with. I had "+". Also, I misunderstood and thought you were disabling some things in the code. Sure, I'll try to squash those next time.

b-rad-NDi commented 4 years ago

I'm just disabling the noise, while leaving other "non noisy" core debug. Should cut down the output by thousands of lines.

brad112358 commented 4 years ago

Yes, something is missing. For some reason, the paste I did of my comand line lost some *

brad112358 commented 4 years ago

Looks like they were interpreted as markup. I used this:
echo "file drivers/media/* +pfl" > /sys/kernel/debug/dynamic_debug/control; echo "file drivers/media/*/* +pfl" > /sys/kernel/debug/dynamic_debug/control; echo "file drivers/media/*/*/* +pfl" > /sys/kernel/debug/dynamic_debug/control

b-rad-NDi commented 4 years ago

ok, unsure if you need to have all three lines...maybe the star is throwing something off. That is fine, just add the lines disabling the noisy functions

brad112358 commented 4 years ago

When I tried with just the first command (single /*), it seemed to miss all the subdirectories

brad112358 commented 4 years ago

I hope it's not a Heisenbug, I created this script, ran it, unplugged and restarted mythtv and I can't seem to reproduce a tuning problem yet. I'll keep trying

#! /bin/bash
enable=$1
if [ -z "$enable" ]; then
    enable="+"
fi
ctl=/sys/kernel/debug/dynamic_debug/control

echo "file drivers/media/* ${enable}pfl" > $ctl
echo "file drivers/media/*/* ${enable}pfl" > $ctl
echo "file drivers/media/*/*/* ${enable}pfl" > $ctl
echo "func dvb_frontend_do_ioctl -p" > $ctl
echo "func dvb_frontend_handle_ioctl -p" > $ctl
echo "func dvb_frontend_get_event -p" > $ctl
echo "func dev_dbg_obj -p" > $ctl
b-rad-NDi commented 4 years ago

the alignment of the stars and planets has slightly changed, give it time

brad112358 commented 4 years ago

OK. I've uploaded two new logs. kernel_6.log.gz is when I couldn't get it to fail. kernel_7.log.gz is after I lost patience and rebooted and was able to trigger a tuning issue fairly quickly. Hope this provides useful info! Thanks

b-rad-NDi commented 4 years ago

Ok, so

I'm guessing it's critical at that point.

Strange thing is why does only the first pair ever fail?

Why does mythtv suddenly go from polling stats once per second to an absurb once per 50ms?

What is the meaning of the bridge error that always appears before tuner failures?

Why does the second em28xx bridge never fail.

b-rad-NDi commented 4 years ago

also, why do i never print out failures from the various read_* functions when they are failing

b-rad-NDi commented 4 years ago

is there any possible way you can exclude the first pair of devices from mythtv? I'd really like (for my own sanity) to see some failures from the second pair of devices.

I'm gonna have to install mythtv on my test machine too :-/ It's hdd is tiny so it doesn't make a very good recording machine, but I need to make sure I see your same errors.

b-rad-NDi commented 4 years ago

ahhh good ole mythtv...install it and it won't even start. ugh. So much for that being easy.

brad112358 commented 4 years ago

Yes, I can easily exclude the first two dvb devices. Do you have any more changes you'd like to make before I try that?

I'm familiar with MythTV only as a long time user, not as a coder. You may be surprised to hear that it's always worked well enough fo me that I've never been tempted to try to fix any bugs myself. So, I'm not sure why the polling rate goes up. I haven't even looked at the kernel logs in detail, but from what you said, perhaps it encountered an error and the code is reacting badly and retrying continuously. Perhaps I can help with getting your MythTV install going. Is it the server which won't start? What is the error message? Did you start it from the command line? Did you try --setloglevel debug?

brad112358 commented 4 years ago

Also, did mythtv-setup work OK?

brad112358 commented 4 years ago

I uploaded kernel_8.log.gz. Before connecting the 4 tuner device, I ran mythtv-setup and configured the first two dvb devices to be associated with no input which should prevent MythTV from using them much. I restarted the backend a few times and recorded some things, but I was unable to reproduce a tuning problem. Then, I ran mythtv-setup again and configured all four tuners to be my antenna input again. But this time, I changed the preference order for the tuners. This should cause the order in which the tuners are used for recording programs to change. When I restarted the backend, there were 4 recordings it tried to start when it came up. Two recordings reported delays in tuning and one of these never succeeded in getting a lock. I restarted the server one more time, and this time a different channel failed to get a lock. I didn't check the log to try to see which tuner failed. Let me know what you see. Perhaps the issue is a race somewhere when all 4 tuners are started at once? Let me know if you need me to try anything else. Thanks!

b-rad-NDi commented 4 years ago

it's just database connectivity issues with mythtv. I presumed the software would at least do some preliminary work first, but it's all done by hand. I'm sure once I sort it out it'll work.

b-rad-NDi commented 4 years ago

The way the device works is like such:

The two devices are completely separate in hardware, as they're behind a usb hub. If you look at lsusb you'll see, it's not a single 4 tuner device, it's two usb devices with the same vid:pid.

If failures only ever happen on the "first" device and never on the "second" that would seem too indicate something awry with the first device.

I'll give a look at the 8 log.

brad112358 commented 4 years ago

Overall, a hardware problem seems unlikely, but I suppose it's possible.

b-rad-NDi commented 4 years ago

There's no mxl692 debug in your 8 log, so not much to look at unfortunately.

brad112358 commented 4 years ago

That's strange. I ran the same script I used last time to enable it.

b-rad-NDi commented 4 years ago

Oddly, again, there is not a single error from the "second" em28xx device. Only the first em28xx device outputs the i2c errors. I'll produce a patch snippet and post it here for you to apply against the bridge to print out a little bit better debug. It appears the "positive" error codes are either aborted i2c read transfers or some hw status issue. It appears the bridge itself returns the "unknown error" when querying the success of the previous i2c operation. I'll have to inquire with my vendor on what the meaning of i2c transfer status code 10/14/18 are.

brad112358 commented 4 years ago

I still have the 4 tuner connected so I ran the script again and I'm seeing plenty of mxl692 functions in the log. Does the kernel have some trouble with sequential messages to the control pseudo file?

b-rad-NDi commented 4 years ago

each message overrides the previous one, to allow driver wide then disable some funcs. If you disable some funcs before enabling driver wide then it won't work. If you're just enabling output it shouldn't matter.

brad112358 commented 4 years ago

I haven't changed the script I included several comments back. Do you see a problem with it? Does the module have to be loaded before enabling debug?

b-rad-NDi commented 4 years ago

I don't see any issue and it's provided the correct logs so far. I'm not sure if it matters if the driver is loaded already, but perhaps it does.

b-rad-NDi commented 4 years ago

This prints out a tiny bit more info from the area the bridge is failing:

https://gist.github.com/b-rad-NDi/f31ebfe033fed6ae1ba75d9710a861ec

b-rad-NDi commented 4 years ago

Ok, all mythtv issues sorted and recordings set up.

brad112358 commented 4 years ago

Great! 4 scheduled to start at once seems to be more likely to fail. If it doesn't fail the first time, you can stop the backend while it's recording and when you restart it, it should start recording all 4 again. Usually a few rounds of this seems to be enough to reproduce a tuning failure for me

b-rad-NDi commented 4 years ago

Well I'm so far away from the transmitter that I'm lucky to have four channels available lol

b-rad-NDi commented 4 years ago

You only see these failures on stream start? I've been cycling the backend every minute or so for ten minutes now and have not seen a single error pop up of any sort whatsoever...I'll keep doing it though.

b-rad-NDi commented 4 years ago

dang, well, I doubt this is what you want to hear...but I've been sitting here stopping and starting the backend and watching four recordings correspondingly stop and start more than once a minute for 30 minutes now. I have not seen a single error printed at all.

I've reached out to the bridge vendor for explanation of the error codes you're seeing. They might be enlightening.

Otherwise, dang it, if there's an error I need to be able to repro it. How frustrating not to be able to :-/ C'est la vie.

b-rad-NDi commented 4 years ago

Power usage is around 0.82A when all four tuners are active. Not a lot of juice, but a decent amount of pull.

b-rad-NDi commented 4 years ago

Oops...those two buttons shouldn't be so close hehehe.

b-rad-NDi commented 4 years ago

Ok, I'll be recording four recordings at a time constantly for the next twelve hours.

Now to see if I can globally limit the size of mythtv recordings so I don't fill up it's ssd.