AllskyTeam / allsky

A Raspberry Pi operated Wireless Allsky Camera
MIT License
[BUG] ASI_ERROR_TIMEOUT (again) #3155

Closed 22878120 closed 3 months ago

22878120 commented 10 months ago

Environment

Bug Description

Log/configuration files

OldWorking_capture.cpp.txt OldWorking_config.sh.txt

current_settings.json.txt current_config.sh.txt allsky.zip

22878120 commented 10 months ago

Update: The latest version was always failing and failed to complete a single day, so I rolled back to v2022.03.01 and it worked immediately without a single error (It also means that there are no hardware problems). This might help in understanding why the new version is having problems capturing with ZWO ASI120MC.

I will keep using the old version till it is safe to upgrade

EricClaeys commented 10 months ago

The ASI120 is very susceptible to errors, even if very little changed. Since we don't have one of those cameras we can't determine what the problem is so will likely never fix it. I do know that most people who try ALL the steps on the Troubleshooting-> ZWO Cameras page are able to solve the problem. I suspect the camera or the Pi are at the limit of USB power and unrelated changes can put them over the limit.

22878120 commented 10 months ago

Thanks, Eric, I will give it a try with the bigger Pi that I use for testing and will post the results here...

EricClaeys commented 10 months ago

@22878120, did you ever try v2023.05.01? There have been a LOT of changes since v2022.03.01. If you're willing to help me narrow down the problem I'd appreciate it. Installing v2023.05.01 (the base release, not the newer point releases) is fairly quick and you can easily revert to v2022.03.01. To do so:

# Stop Allsky
cd
mv allsky allsky-OLD
git clone --branch v2023.05.01 --recursive https://github.com/thomasjacquin/allsky.git 
cd allsky
./install.sh

If that works I can then compare that release to the point releases to see if I can find something.

22878120 commented 10 months ago

Hi Eric,

I'm happy to help. I followed instructions and installed v2023-05-01. While installing I got an error "Package # 1 of 15: [opencv_python>=4.5.3.56]"

PFA Python_dependencies.1.log Python_dependencies.1.log.txt

PS: From what I read, it may be a hash discrepancy with the one specified in the setup.

Best regards

EricClaeys commented 10 months ago

@Alex-developer, do you remember what the fix was for the error "Package # 1 of 15: [opencv_python>=4.5.3.56]" (see the attached file above)? I'd like to run a test with @22878120 using base release v2023-05-01 (no point release), and it won't install.

Eric

Alex-developer commented 10 months ago

Try removing the version number and see if it works

EricClaeys commented 9 months ago

@22878120, Can you try removing the version number from the opencv_python line in allsky/config_repo/requirements-64.txt (or 32.txt if you are on a 32-bit operating system), then reinstall? This is for the v2023.05.01 release.
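The edit suggested above could be scripted roughly as follows. This is only a sketch: `unpin_requirement` is a hypothetical helper name, it assumes the package is spelled in the file exactly as in the error message (`opencv_python`), and it relies on GNU `sed -i`. A backup of the file is kept next to it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Allsky): strip the version specifier from
# one package's line in a pip requirements file, keeping a backup as FILE.bak.
unpin_requirement() {
    local package="$1" file="$2"
    cp -- "${file}" "${file}.bak"
    # Match the package name followed by a comparison operator (>=, ==, ~=, <, ...)
    # and drop everything from the operator to the end of the line.
    sed -i -E "s/^(${package})[[:space:]]*[<>=!~].*$/\1/" "${file}"
}

# Example call (path taken from the comment above):
#   unpin_requirement opencv_python ~/allsky/config_repo/requirements-64.txt
```

Lines for other packages are left untouched, so the change is easy to review with `diff` against the `.bak` copy before reinstalling.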

22878120 commented 9 months ago

@EricClaeys

Still having trouble, as shown in the screenshot below.

Python_dependencies.1.log

image

EricClaeys commented 9 months ago

@Alex-developer, any other ideas? Would using a requirements file from one of the point releases be better?

@22878120, would you please run

uname -a
cat /etc/os-release

22878120 commented 9 months ago

@EricClaeys

image

EricClaeys commented 9 months ago

@22878120, thanks, that Pi is running 32-bit Buster. Is this the same Pi as in your initial post which was a Pi 4B, 1 GB memory running Bullseye?
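For anyone who prefers text over a screenshot, the same two facts (OS codename and 32- vs 64-bit userland) can be printed on one line. A minimal sketch, assuming a standard `os-release` file; `describe_os` is a hypothetical name, and `getconf LONG_BIT` reports the userland word size, which can differ from the kernel architecture shown by `uname -a`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the OS codename and userland word size.
# The os-release path is a parameter so the function can be pointed at a copy.
describe_os() {
    local os_release="${1:-/etc/os-release}"
    local codename bits
    # os-release is a list of KEY=value lines; VERSION_CODENAME holds e.g. "buster".
    codename="$(sed -n 's/^VERSION_CODENAME=//p' "${os_release}" | tr -d '"')"
    # Fall back to "unknown" if getconf is not available on this system.
    bits="$(getconf LONG_BIT 2>/dev/null || echo unknown)"
    echo "codename=${codename:-unknown} bits=${bits}"
}

# Example call:
#   describe_os
```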

If your installation problems are on your test Pi, try installing the latest Allsky release (Point Release 4) and see if that works. If not, then try Point Release 3. Then 2, then 1, then v2023.05.01.

Is your 120 camera USB 2 or 3?

22878120 commented 9 months ago

@EricClaeys I installed the latest version #Version: v2023.05.01_04. It worked fine at the beginning, BUT the moment I adjusted the night max exposure time to 10000 I started to get ASI_ERROR_TIMEOUT

This Pi is the production one (not the test).

EricClaeys commented 9 months ago

@22878120, I assume you didn't change anything with the camera and the only change was installing v2023.05.01_04 ? If so, I wouldn't have expected it to work. Strange about working when exposure time < 10,000.

Please save all the "requirements" files in allsky/config_repo, then do a "git clone" of v2023.05.01 but before installing, copy the saved "requirements" files to allsky/config_repo. Then try installing and testing. The newer "requirements" files should get you past that installation problem.

Eric
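The save-and-restore steps described in this comment might look like the sketch below. The function names and the `requirements-SAVED` directory are hypothetical; the only things taken from the thread are the `allsky/config_repo` location and the fact that the files have "requirements" in their names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for carrying the newer "requirements" files across a re-clone.

# Copy every requirements* file out of an Allsky checkout into a save directory.
save_requirements() {
    local allsky_dir="$1" save_dir="$2"
    mkdir -p -- "${save_dir}"
    cp -- "${allsky_dir}"/config_repo/requirements* "${save_dir}/"
}

# Copy the saved files into a fresh checkout, overwriting the ones it shipped with.
restore_requirements() {
    local save_dir="$1" allsky_dir="$2"
    cp -- "${save_dir}"/requirements* "${allsky_dir}/config_repo/"
}

# Possible sequence (clone command as given earlier in the thread):
#   save_requirements ~/allsky ~/requirements-SAVED
#   mv ~/allsky ~/allsky-OLD
#   git clone --branch v2023.05.01 --recursive https://github.com/thomasjacquin/allsky.git ~/allsky
#   restore_requirements ~/requirements-SAVED ~/allsky
#   cd ~/allsky && ./install.sh
```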

22878120 commented 9 months ago

@EricClaeys Followed instructions but all python requirements failed!! I did manually install them using pip and removed the troubled item from requirements file, but the next item keeps failing!!

image

I had to update the following line in install.sh to install the Python requirements:

pip3 install --no-warn-script-location "${package}" -r /tmp/package > "${L}" 2>&1

Now installation is successful... Will configure the new setup and update you

EricClaeys commented 9 months ago

@22878120 thanks for the update. Something changed after we released v2023.05.01 which causes that installation to fail. Glad you found a workaround.

Even if this version doesn't work, please save it for possible future testing. When I need to save a version I rename allsky to allsky-SAVED.

FYI, a user with an ASI120 that gave TIMEOUT_ERRORS got a new camera and sent me his 120. I hope it doesn't work for me so I can troubleshoot it.

22878120 commented 9 months ago

@EricClaeys My 1st problem is the following error in the day capture (debug=4)

Jan  3 08:45:47 allskycam01 allsky[7987]: STARTING EXPOSURE at: 2024-01-03 08:45:47   @ 1.0 sec
Jan  3 08:46:05 allskycam01 allsky[7987]:   > Saving DAY image 'image-20240103084547.jpg'
Jan  3 08:46:06 allskycam01 allsky[7987]: Traceback (most recent call last):
Jan  3 08:46:06 allskycam01 allsky[7987]:   File "/home/pi/allsky/scripts/flow-runner.py", line 52, in <module>
Jan  3 08:46:06 allskycam01 allsky[7987]:     import allsky_shared as shared
Jan  3 08:46:06 allskycam01 allsky[7987]:   File "/home/pi/allsky/scripts/modules/allsky_shared.py", line 21, in <module>
Jan  3 08:46:06 allskycam01 allsky[7987]:     import board
Jan  3 08:46:06 allskycam01 allsky[7987]:   File "/home/pi/.local/lib/python3.7/site-packages/board.py", line 48, in <module>
Jan  3 08:46:06 allskycam01 allsky[7987]:     from adafruit_blinka.board.raspberrypi.raspi_4b import *
Jan  3 08:46:06 allskycam01 allsky[7987]:   File "/home/pi/.local/lib/python3.7/site-packages/adafruit_blinka/board/raspberrypi/raspi_4b.py", line 6, in <module>
Jan  3 08:46:06 allskycam01 allsky[7987]:     from adafruit_blinka.microcontroller.bcm2711 import pin
Jan  3 08:46:06 allskycam01 allsky[7987]:   File "/home/pi/.local/lib/python3.7/site-packages/adafruit_blinka/microcontroller/bcm2711/pin.py", line 5, in <module>
Jan  3 08:46:06 allskycam01 allsky[7987]:     from RPi import GPIO
Jan  3 08:46:06 allskycam01 allsky[7987]: ModuleNotFoundError: No module named 'RPi'

EricClaeys commented 9 months ago

@22878120, since we are trying to debug the TIMEOUT_ERRORS, can you disable all modules and set the Overlay Method to legacy?

The 120 I got yesterday is giving a ton of those errors so I finally have something to test with.

22878120 commented 9 months ago

@EricClaeys It is working now. Auto exposure, interval day 5 minutes, night 1 minute. Will get back to you with the log as soon as it shows the timeout error

EricClaeys commented 9 months ago

This is what I hate about the TIMEOUT errors. They seem to come and go on a whim. I think if you stand on your left foot and chant it will work :-)

My best guess is that the Pi and/or camera are on the border of some specification and minor changes in timing or voltage or temperature throw them over the line.

22878120 commented 9 months ago

My observation is that this happens at night, where exposures are longer and more frequent...

Will keep my right foot up :)

22878120 commented 9 months ago

@EricClaeys

3 ASI_ERROR_TIMEOUT errors occurred yesterday. PFA allsky.log

allsky.log
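A quick way to summarize a log like the attached one is to count the timeout lines, overall and per day. This is a sketch only: both function names are hypothetical, and the per-day grouping assumes syslog-style lines that start with a month and day, like the `Jan  3 08:45:47 ...` lines quoted earlier in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for summarizing ASI_ERROR_TIMEOUT messages in a log file.

# Total number of lines mentioning the error. grep exits non-zero when the
# count is 0, so "|| true" keeps the function usable under "set -e".
count_timeouts() {
    local log_file="$1"
    grep -c 'ASI_ERROR_TIMEOUT' "${log_file}" || true
}

# Number of matching lines per day, keyed on the first two fields (month, day).
timeouts_per_day() {
    local log_file="$1"
    awk '/ASI_ERROR_TIMEOUT/ { n[$1 " " $2]++ }
         END { for (d in n) print d ": " n[d] }' "${log_file}" | sort
}

# Example calls (log location is an assumption; use wherever your log lives):
#   count_timeouts /var/log/allsky.log
#   timeouts_per_day /var/log/allsky.log
```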

22878120 commented 9 months ago

@EricClaeys

After running the base version for several days, I only got the following ASI_ERROR_TIMEOUT errors

> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #1 (with 0.8 exposure = YES) (11) (January 03, 10:49:12 PM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #2 (with 0.8 exposure = YES) (11) (January 03, 11:14:40 PM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #3 (with 0.8 exposure = YES) (11) (January 04, 12:47:40 AM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #4 (with 0.8 exposure = YES) (11) (January 05, 04:34:24 AM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #5 (with 0.8 exposure = YES) (11) (January 06, 01:57:33 AM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #6 (with 0.8 exposure = YES) (11) (January 06, 01:58:33 AM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #7 (with 0.8 exposure = YES) (11) (January 06, 01:59:34 AM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #8 (with 0.8 exposure = YES) (11) (January 06, 03:58:27 AM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #9 (with 0.8 exposure = YES) (11) (January 06, 04:46:42 AM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #10 (with 0.8 exposure = YES) (11) (January 06, 06:46:14 PM)
> ERROR: Failed getting image: ASI_ERROR_TIMEOUT #11 (with 0.8 exposure = YES) (11) (January 08, 04:08:04 AM)

and it is still capturing as normal. I guess when it retries it works.

I reinstalled the latest version (on the same hardware); it failed immediately and reached a count of 11 in the first minute...

Any update on this error?

chop249 commented 9 months ago

@EricClaeys @Alex-developer I just downloaded and installed the latest last night for the first time. I'm seeing the same issue on a 4B with 1Gig of RAM and the ASI120MC. I'd be happy to help test and I have set it up for you to have remote access to the pi/web ui if that helps the cause.

EricClaeys commented 9 months ago

@22878120, it's working now? What release? v2023.05.01? There need to be 4 (or 5?) consecutive errors for Allsky to abort. If you have 3 and then the next image works, the error counter gets reset.
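The consecutive-error rule just described can be illustrated with a small simulation. This is not Allsky's code, only a sketch of the rule as stated in the comment: `would_abort` is a hypothetical function, and the limit is passed in as an argument because the exact number (4 or 5) is not confirmed here.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of a consecutive-error counter.
# Arguments: the abort limit, then a sequence of results ("ok" or "timeout").
# Returns 0 (true) if the sequence would trigger an abort, 1 otherwise.
would_abort() {
    local max="$1"
    shift
    local consecutive=0 result
    for result in "$@"; do
        if [ "${result}" = "timeout" ]; then
            consecutive=$((consecutive + 1))
            if [ "${consecutive}" -ge "${max}" ]; then
                return 0
            fi
        else
            # Any successful image resets the counter.
            consecutive=0
        fi
    done
    return 1
}

# Example calls, assuming a limit of 4:
#   would_abort 4 timeout timeout timeout ok timeout   # false: counter was reset
#   would_abort 4 timeout timeout timeout timeout      # true: 4 in a row
```

Under this rule, scattered timeouts over several days never stop the capture, while a burst of back-to-back timeouts does.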

Let's give it several more days and if it continues to work we can try the first point release, then keep installing newer ones until it fails. I can then look at what changed.

The 120 I just got constantly gets the errors. I haven't tried an older release but I did try a different exposure algorithm which worked great.

EricClaeys commented 9 months ago

@22878120, can you let @chop249 know how you got the base release installed?

@chop249, has this camera worked with any other Allsky release?

22878120 commented 9 months ago

@EricClaeys It is working now. Version: v2023.05.01 (the base version). I'm using it in prod now and you can check it https://allskycam.mhammady.info

Just a side note. When the latest was installed and tried, the ASI_ERROR_TIMEOUT error popped up immediately. I stopped the service, renamed the allsky -> allsky-latest, and allsky-saved -> allsky. After I restarted the service, ASI_ERROR_TIMEOUT showed up immediately (with base release). I restarted the whole RPi and since then it is working fine. I guess the latest gave the camera a command or configuration that caused this issue with both versions. Just my 2 cents...

chop249 commented 9 months ago

@EricClaeys never tried it before. Had a pi laying around, had the camera laying around, saw a youtube video a couple nights ago and decided to give it a go. Camera tests good on my Mac with the ASI software.

Also, to rule out power issues, I just tried a powered USB hub between the Pi and the camera. I'm still getting that error and Allsky is still shutting down.

EricClaeys commented 9 months ago

@chop249, please try the other things on the Troubleshooting -> ZWO cameras documentation page. It used to fix the problem for the majority of people. Thanks

EricClaeys commented 9 months ago

@22878120, @chop249, I am starting to work on what I hope will be a fix. Assuming it works with the 120 I have, I would like you guys to help test. It will entail replacing a couple files and recompiling.

@22878120, the fix will be on v2023.05.01_04 so please keep your allsky-newest.

Eric

chop249 commented 8 months ago

@EricClaeys I went through the Troubleshooting. What seems to have fixed mine was putting it on a WinDOZE box and updating the FW. Not sure why that worked, the FW was from 2013...... LOL! I used the one that said ASI120MC-compatible.iic. Been running for about an hour and a half now but it seems to miss a capture every now and then. The offer still stands to remote in to my pi for testing. I probably won't build the enclosure for a few days. So it's just pointed at the wall at the moment. I've already set up the access, would just have to PM you info.

chop249 commented 8 months ago

@EricClaeys the Gremlins are back. It ran fine all day, I moved it outside and got pics just fine. Night pics were coming in total black. I tried to adjust the gain and we went back to timeouts. Put the settings back and it's still not working. I am on 2023.05.01_4 when you're ready to start testing. Thank you!

chop249 commented 8 months ago

Something else weird is going on. I can capture daytime images just fine but when we go to night I get the ASI_TIMEOUT now. It isn't running the timelapse on the daytime images either. I assume because it has shut down overnight?

EricClaeys commented 8 months ago

@chop249, if Allsky stops because of too many consecutive errors, it won't run the end-of-night tasks like creating timelapse.

Other people have noticed that the ASI_TIMEOUTs appear when they reach a certain exposure. Do you start getting the ASI_TIMEOUTs as soon as night begins, or once the exposure time increases? Other people have seen the ASI_TIMEOUTs start when the temperature reached a certain point. For most people there's no rhyme or reason. I still think it's a bug in ZWO's Linux library since people have said it works fine on Windows.

chop249 commented 8 months ago

@EricClaeys trying to drop to the base version but I'm running into the exact same issues @22878120 had with python opencv. I tried installing opencv but it is still giving me the same issue with install. Wondering which line was deleted in the install.sh, modifying pip3 install --no-warn-script-location "${package}" -r /tmp/package > "${L}" 2>&1 alone is not getting it.

EricClaeys commented 8 months ago

@22878120, can you let @chop249 know how you got the base release installed?

22878120 commented 8 months ago

@chop249 The fix is here https://github.com/AllskyTeam/allsky/issues/3155#issuecomment-1874449998

@EricClaeys Any ETA for the fix you want us to test?

EricClaeys commented 8 months ago

No ETA but probably a couple weeks at least.

chop249 commented 8 months ago

So I did a fresh install. I believe I have opencv installed. 1st pic is with no mod, 2nd is prior to modding install.sh, 3rd is with modification and 4th is install attempt post mod. 5th is contents of the log. What am I missing?

Screenshot 2024-01-16 at 14 50 01 Screenshot 2024-01-16 at 14 51 12 Screenshot 2024-01-16 at 14 52 30 Screenshot 2024-01-16 at 14 55 29 Screenshot 2024-01-16 at 14 56 41

EricClaeys commented 8 months ago

@chop249, what operating system are you on? Bookworm?

chop249 commented 8 months ago

@EricClaeys yes Bookworm, 64bit and no desktop.

EricClaeys commented 8 months ago

@chop249, Point Release 4 is the only release that works on Bookworm so we'll need to do brain surgery. Rename your current allsky to allsky-SAVED then do a normal "git clone" if you don't already have Point Release 4 installed. Then:

mv allsky/src allsky/src-PR4
cp -Ra allsky-SAVED/src allsky
cd allsky
./install.sh

This will use the Point Release 4 Allsky EXCEPT for the capture program which will be from the base release. To be honest I don't know if this will work but most likely it will. Assuming it does and you don't get the ASI_ERROR_TIMEOUT's I'd like to try using the Point Release 3 capture program and if it doesn't work the Point Release 2 then 1. This will tell me exactly which release started giving the timeouts which will hopefully give me an idea what to look for.
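If the test is repeated with the capture program from other releases, the two `mv`/`cp` steps above could be wrapped in a small function. This is a sketch under assumptions: `swap_capture_src` is a hypothetical name, and directory names such as `allsky-PR3` are made up for illustration. It refuses to run if the backup name is already taken, so an earlier saved `src` is not overwritten.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace ALLSKY_DIR/src with the src directory from
# another checkout (DONOR_DIR), keeping the current one under BACKUP_NAME.
swap_capture_src() {
    local allsky_dir="$1" donor_dir="$2" backup_name="$3"
    if [ ! -d "${donor_dir}/src" ]; then
        echo "no src directory in ${donor_dir}" >&2
        return 1
    fi
    if [ -e "${allsky_dir}/${backup_name}" ]; then
        echo "${allsky_dir}/${backup_name} already exists" >&2
        return 1
    fi
    mv -- "${allsky_dir}/src" "${allsky_dir}/${backup_name}"
    # Same copy as in the steps above: creates ALLSKY_DIR/src from the donor.
    cp -Ra -- "${donor_dir}/src" "${allsky_dir}/"
}

# Example calls (allsky-PR3 is a made-up directory name):
#   swap_capture_src ~/allsky ~/allsky-SAVED src-PR4
#   swap_capture_src ~/allsky ~/allsky-PR3 src-base
```

As in the steps above, `./install.sh` would still need to be run from the `allsky` directory after each swap.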

chop249 commented 8 months ago

@EricClaeys PROGRESS! So it seems to be working set up in my living room with the lights turned off and lights on in another room with the door cracked. There are a LOT of hot pixels but I haven't done darks to combat that. A couple of things: it is throwing the attached errors, and I used the settings from the 1st _4 install that had been moved to allsky-OLD. Not sure if that's the culprit for these errors. Also, the night interval said 50 seconds initially with 30000 as the setting. I dialed the interval back to 10000 and it is now at 30 seconds. I'll do an outdoor test starting in the morning. Don't feel like going out in the 19°F temps right now. LOL! If you think we should blow off the allsky-OLD settings and just start from scratch I can do that as well.

Screenshot 2024-01-16 at 22 36 09

EricClaeys commented 8 months ago

@chop249, GREAT!! One of the point releases split the meanthreshold setting into day and night values. You can edit allsky/config/settings.json to remove the daymeanthreshold and nightmeanthreshold lines, then make sure you set the Mean Threshold in the WebUI to 0.024.
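Removing those two lines by hand works; a scripted version might look like this sketch. `remove_split_thresholds` is a hypothetical name. It assumes the settings file has one `"key": value` pair per line and that neither key is the last entry in the object, since deleting the last entry would leave a dangling comma on the line before it. A backup is kept as `settings.json.bak`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop the daymeanthreshold and nightmeanthreshold lines
# from a settings file, keeping the original as FILE.bak.
# Assumes one "key": value pair per line and that neither key is the last entry.
remove_split_thresholds() {
    local settings="$1"
    cp -- "${settings}" "${settings}.bak"
    grep -v -e '"daymeanthreshold"' -e '"nightmeanthreshold"' \
        "${settings}.bak" > "${settings}"
}

# Example call (path taken from the comment above):
#   remove_split_thresholds ~/allsky/config/settings.json
```

The Mean Threshold value itself still has to be set in the WebUI afterwards, as described above.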

The ASI120 is known for hot pixels.

Let's run like this for a while until we are convinced it's working fine, then put the Point Release 3 capture program in place and see how it does.

chop249 commented 8 months ago

@EricClaeys day and night 1 down. It threw 1 timeout about 18:40 yesterday and did not do the star trails pic. Is that because I am capturing both night and day pictures? Other than that I need some fine tuning to the focus and dark frames. I'll look and see how to get it on my website and I'll post a link later today. Let me know how long you want to let it test in this state before trying other things. Saturday I know I'm pretty busy schedule wise, outside of that I'm fairly flexible again till the 27th.

22878120 commented 8 months ago

@EricClaeys The Base version failed again. Waiting for the new code to be tested...

image allsky.log.zip

EricClaeys commented 8 months ago

@chop249 an occasional message is fine.

Did keogram and timelapse work? When you say startrails didn't work do you mean the picture doesn't have any trails, or no picture was created?

chop249 commented 8 months ago

@EricClaeys sorry I should have been clearer. All three produced their respective output. The star trails image I have attached. startrails-20240117

chop249 commented 8 months ago

Sample of a night image. image-20240117231440

EricClaeys commented 8 months ago

You'll need to play with the threshold. There's a documentation page that gives some ideas. The black image is 100 ms, which is 1/10 sec, so it may have been a day image?