kerberos-io / kios

A Linux OS created by Buildroot which runs Kerberos Open Source out-of-the-box.
https://www.kerberos.io

KIOS 2.7.0 on Raspberry Pi 3, web UI not accessible (reboot needed to bring it back) #32

Closed ChieftainY2k closed 6 years ago

ChieftainY2k commented 6 years ago

Hello.

I have a fresh installation of KIOS 2.7.0 on an RPi 3. From time to time there is an issue with accessing the web UI: the HTTP requests freeze and then time out. Tested with Firefox, Chrome, curl, and wget.

That's how it looks from the KIOS shell:

[root@kios-6df342df tmp]# wget localhost
--2018-06-06 07:23:24--  http://localhost/
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 
[root@kios-6df342df tmp]# wget localhost --timeout=60
--2018-06-06 07:25:46--  http://localhost/
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... Read error (Connection timed out) in headers.
Retrying.
--2018-06-06 07:26:47--  (try: 2)  http://localhost/
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... Read error (Connection timed out) in headers.
Retrying.

.......... etc. until wget permanently fails.

In the meantime the machinery seems to be working fine:

[root@kios-6df342df tmp]# tail -f /etc/opt/kerberosio/logs/log.stash
2018-06-06 07:21:34,556 VERBOSE-1 [default] IoDisk: saving image 1528269691_6-746980_frontdoor_250-133-294-348_28_848.jpg
2018-06-06 07:21:36,748 VERBOSE-1 [default] IoVideo: end writing images
2018-06-06 07:21:36,748 VERBOSE-1 [default] IoVideo: remove videowriter
2018-06-06 07:21:36,748 VERBOSE-1 [default] IoVideo: unlocking write thread
2018-06-06 07:21:41,422 VERBOSE-1 [default] HullExpositor: activity detected from (387,388) to 

The web UI is responsive again only after KIOS is rebooted. Tom.

Responseless commented 6 years ago

I am also getting very slow web response and timeouts with the latest version. RPI 3, PICam2.1.

cedricve commented 6 years ago

I noticed it as well last night; any help would be appreciated.

cedricve commented 6 years ago

@Responseless @ChieftainY2k can you give me your hardware details? Which Pi, and are you on Ethernet or Wi-Fi?

Responseless commented 6 years ago

@cedricve Hey Cedric, Raspberry Pi 3 Model B Rev 1.2. Linux kios-b14a3fc8 4.14.30-v7. Running over Wifi.

cedricve commented 6 years ago

Thanks. Is it just the web interface that's timing out, or the SSH connection as well?

Responseless commented 6 years ago

Just the web, as far as I could tell. I was watching top and the CPU usage wasn't high for anything, so it was odd. Clicking through the older recordings takes a long time to load the thumbnails/previews, and that's where it stalls.

cedricve commented 6 years ago

Hmm yeah, not sure if it's caused by the Raspberry Pi/Linux version. I'm creating a new release with the newest Linux kernel. Do you have an idea after how many minutes the issue starts happening? Or is it random?

We have a couple of issues with 2.7.0, so I marked it as prerelease / beta.

Responseless commented 6 years ago

@cedricve Not sure how long until it starts acting up, but it works perfectly after a reboot. Also, now that you have marked it as a beta, the web UI offers it as a new version: 'Good news, a new release of KiOS is available!' v2.7.0-beta ;)

cedricve commented 6 years ago

Fuck 🗡 I'll revert the name, and keep it in beta release. How long was the system running before you noticed it?

Responseless commented 6 years ago

@cedricve haha. I don't sit here and watch it all day, so I can't say for sure, but I would guess hours. Mine has been up for 21 hours and has the issue. I will reboot it and see if I can notice how long it takes, but no guarantees, since it's almost dark and there will be a lot less activity on the camera soon.

cedricve commented 6 years ago

Ok, I'll try to simulate it. Thanks for all the info @Responseless!

marcel31415 commented 6 years ago

I can't confirm any timeouts. I will try to write a script and will keep an eye on it for a couple of hours.

@Responseless @ChieftainY2k Did you both set up a brand new installation?

cedricve commented 6 years ago

@marcel31415 indeed they did.

marcel31415 commented 6 years ago
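
#!/bin/bash
# Poll the URL given as $1 every 10 seconds; log any status other than 302.
RESPONSE=response.txt

while true
do
    # --max-time stops a hung request from blocking the loop indefinitely
    status=$(curl -s --max-time 60 -o "$RESPONSE" -w '%{http_code}' "$1")
    now=$(date +"%T")
    if [ "$status" != "302" ]
    then
        # Append (>>) so that earlier failures are not overwritten
        echo "$now $status" >> timeout_test.log
    fi
    echo "$now $status"
    sleep 10
done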

I am not that good at shell, but I think this should work. It will return "302" if the page is found. If not, it will write the output to the log file.

If you have anything better feel free to post it and improve my poor script. I will now start keeping an eye on that.

UPDATE: I think I just made it crash ;-) Maybe the curl requests were too fast... so I increased the "sleep".

UPDATE2: The machine does not come back online... I've got something else to test this evening. After that I will come back to this.

Responseless commented 6 years ago

I made it lock up, but it does eventually load. From the date page at /images/05-06-2018, the XHR request to /api/v1/images/05-06-2018/12/1/11 takes 1.6 minutes to return. I do not have many images or videos to load.

Edit: Just got the 500 Internal Server Error from /api/v1/images/05-06-2018/12/1/8 . That took 2.7 minutes to die.

cedricve commented 6 years ago

Hmm, that's crazy. Do you have steps to reproduce?

cedricve commented 6 years ago

Ok, thanks for testing @Responseless and @marcel31415. It would be great if we could confirm that it doesn't happen on the 2.6.1 release.

Responseless commented 6 years ago

@cedricve I can reproduce. Reboot, log in to the website. Click on the date on the left, then click halfway down a busy part of the timeline. I have about 3000 images/videos.

I suspect that $getID3->analyze() is the culprit in my testing. Did you add this in 2.5.1?

cedricve commented 6 years ago

Nope, it has been there for a couple of releases.

Responseless commented 6 years ago

@cedricve The $getID3->analyze() call was the problem for me. I replaced the getMetadata result with an empty object and did a basic string check of the file extension instead, and it's instant now. The issue was the metadata processing across a large number of files. Maybe we just didn't notice in earlier releases?

It seems when the while loop is checking metadata it locks up the whole website.
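
For illustration only, here is a minimal sketch of what a check based purely on the file extension can look like. It is written in shell, like the test script earlier in this thread, and is not the web app's actual code; the function name media_type_by_extension and the list of extensions are assumptions. Because the file is never opened, such a check is fast, but it cannot tell a valid file from a corrupt one.

```shell
#!/bin/bash
# Hypothetical helper: classify a capture file by its extension only,
# without opening or parsing the file. The extension list is an assumption.
media_type_by_extension() {
    local name=${1##*/}      # strip any directory part
    local ext=${name##*.}    # text after the last dot
    # A name without a dot has no extension at all
    if [ "$ext" = "$name" ]; then
        ext=""
    fi
    # Lowercase with tr so this also works on older bash versions
    ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        jpg|jpeg|png) echo image ;;
        mp4|avi|mkv)  echo video ;;
        *)            echo unknown ;;
    esac
}

# File name taken from the machinery log at the top of this thread
media_type_by_extension "1528269691_6-746980_frontdoor_250-133-294-348_28_848.jpg"
```

The last line prints "image" for the file name from the log above.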

cedricve commented 6 years ago

Thanks for reporting, @Responseless. I do this to check whether video files are valid, because otherwise corrupt video files might be shown in the browser.

cedricve commented 6 years ago

@ChieftainY2k how many files have you stored on your system?

ChieftainY2k commented 6 years ago

@cedricve

The HTTP server hangs in a pretty random fashion. Sometimes after an hour or so, sometimes after I play around with the GUI. I will try to reproduce it somehow.

cedricve commented 6 years ago

@ChieftainY2k can you temporarily move the /data/machinery/capture folder to another directory, so that this directory is empty? I'm just wondering whether, as @Responseless states, it's caused by the number of images.
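
One possible way to do this from the KIOS shell is sketched below. The helper name set_aside_dir and the timestamped backup name are assumptions, not part of KIOS. The sketch does not stop the machinery first, and the recreated directory may end up with a different owner or mode than the original, so treat it as a starting point rather than a tested procedure.

```shell
#!/bin/bash
# Hypothetical helper: move a directory aside under a timestamped name and
# leave a new, empty directory in its place. Prints the backup path.
set_aside_dir() {
    local dir=${1%/}    # drop a trailing slash, if any
    local backup
    backup="${dir}.bak-$(date +%Y%m%d-%H%M%S)"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # The backup is a sibling of the original, so mv is a rename and
    # stays quick even with thousands of files.
    mv -- "$dir" "$backup" && mkdir -- "$dir" && echo "$backup"
}

# Example with the path from this thread (run on the KIOS box):
# set_aside_dir /data/machinery/capture
```

To restore the old recordings later, remove the new directory and move the backup back to its original name.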

ChieftainY2k commented 6 years ago

@cedricve Ok, I've just removed all saved media files and rebooted, will see what happens when the machine directory fills up.

ChieftainY2k commented 6 years ago

The problem seems to be gone with the newest KIOS v2.7.1. I will report back if it happens again.

cedricve commented 6 years ago

Thanks @ChieftainY2k, I reverted something in the machinery and upgraded the firmware.