aaronwmorris / indi-allsky

Software to manage a Linux-based All Sky Camera.
GNU General Public License v3.0
248 stars 41 forks source link

RPi4 8GB, Raspi OS Bullseye 64bit, RPi HQ Camera - issues - indi-allsky keeps stopping by itself as well as when reloading config from system->services. Also focus mode not working #593

Closed baryah closed 1 year ago

baryah commented 1 year ago

I happened to be using Thomas Jacquin Allsky from last couple of years and recently gave the indi-allsky a try. I have had following issues with my configuration -

  1. I had to choose no GPS during installation to be able to set the latitude and longitude manually (i don't have any GPS module on my RPi).
  2. Most of the times when I make changes to the configuration from the webpage and choose to Reload on save, the indi-allsky service crashes.
  3. When I enable the Focus mode in the configurations, I constantly get 'No image for 5 minutes' and no image is shot.
  4. The indi-allsky service keeps on stopping by itself randomly.
  5. Another Issue I am having is that the web interface shows allsky service to be 'DOWN' while the service is running; even the system->services page shows it to be active.

During installation I chose the libcamera and the imx447.

indi-allsky.log

Screenshot 2023-02-28 at 12 08 32 AM

indi-allsky-focus-mode-on.log

aaronwmorris commented 1 year ago

I had to choose no GPS during installation to be able to set the latitude and longitude manually (i don't have any GPS module on my RPi).

This is expected. Do not select GPS support if you do not have a GPS adapter.

Most of the times when I make changes to the configuration from the webpage and choose to Reload on save, the indi-allsky service crashes.

Investigating...

When I enable the Focus mode in the configurations, I constantly get 'No image for 5 minutes' and no image is shot.

Investigating...

The indi-allsky service keeps on stopping by itself randomly.

I see the sqlite database is being locked, which does not make any sense. How recently did you pull down this code?

Another Issue I am having is that the web interface shows allsky service to be 'DOWN' while the service is running; even the system->services page shows it to be active.

This could be a timing issue. It could take 1-2 minutes before the DOWN message show RUNNING, but I am investigating.

aaronwmorris commented 1 year ago

Regarding focus mode. If focus mode is enabled, due to a recent change, it would only show the image in the Focus Mode view. I merged #595 to show the image on the Latest Image view even when focus mode is enabled.

aaronwmorris commented 1 year ago

Regarding the crashes due to the database being locked, it is acting like the database journal is not in WAL mode. Can you show me the output of the following?

sqlite3 /var/lib/indi-allsky/indi-allsky.sqlite "PRAGMA journal_mode;"
baryah commented 1 year ago
Screenshot 2023-02-28 at 4 14 53 PM
baryah commented 1 year ago

i downloaded it yesterday only

I had to choose no GPS during installation to be able to set the latitude and longitude manually (i don't have any GPS module on my RPi).

This is expected. Do not select GPS support if you do not have a GPS adapter.

Most of the times when I make changes to the configuration from the webpage and choose to Reload on save, the indi-allsky service crashes.

Investigating...

When I enable the Focus mode in the configurations, I constantly get 'No image for 5 minutes' and no image is shot.

Investigating...

The indi-allsky service keeps on stopping by itself randomly.

I see the sqlite database is being locked, which does not make any sense. How recently did you pull down this code?

Another Issue I am having is that the web interface shows allsky service to be 'DOWN' while the service is running; even the system->services page shows it to be active.

This could be a timing issue. It could take 1-2 minutes before the DOWN message show RUNNING, but I am investigating.

I pulled down the code yesterday

In focus mode, from the log it is clear that the pictures are being captured, but I cannot see them in either the 'Latest' pane which shows "No image for 5 minutes" and the "Focus" pane shows the last image captured during capture mode

aaronwmorris commented 1 year ago

Issues 2, 3, and 5 (possibly 4) all point to issues with the database. With the database journal in WAL mode, it should not be locking since this allows multiple readers and writers.

In the past, I have experienced the database locking issue which is why the WAL journal is utilized.

Is this a new SDcard you are using? I do not like to cavalierly blame the card, but bad storage is the only thing makes sense to me.

baryah commented 1 year ago

I ll change the card and see how it behaves

baryah commented 1 year ago

Issues 2, 3, and 5 (possibly 4) all point to issues with the database. With the database journal in WAL mode, it should not be locking since this allows multiple readers and writers.

In the past, I have experienced the database locking issue which is why the WAL journal is utilized.

Is this a new SDcard you are using? I do not like to cavalierly blame the card, but bad storage is the only thing makes sense to me.

Hard to believe, but changing the SD card actually did the trick. Everything is working fine now. Thanks. a lot for your guidance. I couldn't have ever guessed this to be the issue.

baryah commented 1 year ago

Still after running perfectly for sometime, the status becomes 'Down' while the services are active.

indi-allsky (2).log

baryah commented 1 year ago

indi-allsky.log

aaronwmorris commented 1 year ago

The database is experiencing locks for some reason.

Every 15 seconds or so indi-allsky updates the timestamp of the WATCHDOG state in the database. If that timestamp exceeds a certain threshold, it is assumed indi-allsky is not running.

Mar  2 23:13:41 indi-allsky [ERROR] MainProcess allsky.unhandled_exception() #34: An uncaught exception occurred:
Mar  2 23:13:41 indi-allsky [ERROR] MainProcess allsky.unhandled_exception() #35: Type: <class 'sqlalchemy.exc.OperationalError'>
Mar  2 23:13:41 indi-allsky [ERROR] MainProcess allsky.unhandled_exception() #36: Value: (sqlite3.OperationalError) database is locked#012[SQL: UPDATE state SET "createDate"=?, value=? WHERE state."key" = ?]#012[parameters: ('2023-03-02 23:13:30.760751', '1677779010', 'WATCHDOG')]#012(Background on this error at: https://sqlalche.me/e/20/e3q8)

The bizarre thing is that this only seems to be affecting updating the WATCHDOG database change. There are other database transactions still running all the time. The only thing this would really affect is the Status message, it does not affect any other functions.

You can try to flip over to mysql/mariadb if you want. https://github.com/aaronwmorris/indi-allsky/wiki/MySQL-MariaDB-Information

aaronwmorris commented 1 year ago

Addendum: Something is still suspect with the sdcard. You may want to see if the following wiki page helps to reduce disk IO on your system:

https://github.com/aaronwmorris/indi-allsky/wiki/Disk-Optimizations

aaronwmorris commented 1 year ago

In light of the report of database locking in #502 , I guess there is an underlying issue here. I think its journald causing the disk I/O to slow down, but I think I found a solution to that.

Merged #601 to extend the I/O timeout for sqlite.

baryah commented 1 year ago

Thanks, Will update

On 04-Mar-2023, at 10:25 PM, Aaron W Morris @.***> wrote:

Merged #601 https://github.com/aaronwmorris/indi-allsky/pull/601 to extend the I/O timeout for sqlite.

baryah commented 1 year ago

Addendum: Something is still suspect with the sdcard. You may want to see if the following wiki page helps to reduce disk IO on your system:

https://github.com/aaronwmorris/indi-allsky/wiki/Disk-Optimizations

Did these optimisations and installed the latest update - Its been one whole night without a single database locking ..

baryah commented 1 year ago

I have had no database locking since yesterday night. Today I tried to generate the darks, I got -

(indi-allsky) pi@indi-allsky:~/indi-allsky $ ./darks.py sigmaclip
2023-03-05 14:15:34,118 [INFO] MainProcess indi.__init__() #135: creating an instance of IndiClient
2023-03-05 14:15:34,119 [INFO] MainProcess indi.__init__() #142: PyIndi version: 1.9.7
2023-03-05 14:15:34,120 [INFO] MainProcess darks._initialize() #182: Connecting to indiserver
INDI::BaseClient::connectServer: creating new connection...
INDI::BaseClient::connectServer: Already connected.
2023-03-05 14:15:34,120 [INFO] MainProcess indi.serverConnected() #328: Server connected (localhost:7624)
2023-03-05 14:15:34,126 [INFO] MainProcess indi.newDevice() #198: new device CCD Simulator
2023-03-05 14:15:34,136 [INFO] MainProcess indi.newDevice() #198: new device Telescope Simulator
2023-03-05 14:15:42,129 [WARNING] MainProcess darks._initialize() #205: Connecting to device libcamera_imx477
INDI::BaseClient: Error. Unable to find driver libcamera_imx477
2023-03-05 14:15:42,190 [INFO] MainProcess miscDb.addCamera() #71: Camera DB ID: 2
2023-03-05 14:15:42,192 [INFO] MainProcess indi.updateCcdBlobMode() #390: Set BLOB mode
2023-03-05 14:15:42,192 [ERROR] MainProcess darks._initialize() #254: CCD night gain above maximum, changing to 16
2023-03-05 14:15:45,196 [ERROR] MainProcess darks._initialize() #263: CCD moon mode gain above maximum, changing to 16
2023-03-05 14:15:48,199 [ERROR] MainProcess darks._initialize() #268: CCD day gain below minimum, changing to 1
2023-03-05 14:15:51,203 [WARNING] MainProcess darks._run() #572: ****** IF THE CCD COOLER WAS ENABLED, YOU MAY CONSIDER STOPPING THIS UNTIL THE SENSOR HAS WARMED ******
2023-03-05 14:15:59,214 [INFO] MainProcess darks._take_exposures() #648: Temp folder: /tmp/tmpv7iack15
2023-03-05 14:15:59,215 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:15:59,217 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpxprjygvm.json --metadata-format json --output /tmp/tmpim1laxxc.dng
2023-03-05 14:17:00,655 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.4402 s
2023-03-05 14:17:00,995 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.88
2023-03-05 14:17:00,996 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:17:00,997 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpfawidsq1.json --metadata-format json --output /tmp/tmp16wrbm5j.dng
2023-03-05 14:18:02,376 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3799 s
2023-03-05 14:18:02,724 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 256.96
2023-03-05 14:18:02,724 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:18:02,725 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpjmwa8902.json --metadata-format json --output /tmp/tmp__clhl51.dng
2023-03-05 14:19:04,094 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3697 s
2023-03-05 14:19:04,438 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 256.92
2023-03-05 14:19:04,439 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:19:04,440 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpbq74wsez.json --metadata-format json --output /tmp/tmpd38g30aq.dng
2023-03-05 14:20:05,809 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3703 s
2023-03-05 14:20:06,128 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.91
2023-03-05 14:20:06,128 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:20:06,129 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpn0zen3u2.json --metadata-format json --output /tmp/tmpsssjrdl1.dng
2023-03-05 14:21:07,505 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3769 s
2023-03-05 14:21:07,827 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.94
2023-03-05 14:21:07,828 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:21:07,829 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmplvn3i0aa.json --metadata-format json --output /tmp/tmp_6llz7c5.dng
2023-03-05 14:22:09,247 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.4191 s
2023-03-05 14:22:09,597 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 256.93
2023-03-05 14:22:09,597 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:22:09,598 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpgggm78b9.json --metadata-format json --output /tmp/tmpnqbqcx1y.dng
2023-03-05 14:23:10,965 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3682 s
2023-03-05 14:23:11,316 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.95
2023-03-05 14:23:11,316 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:23:11,317 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmp_2vlz9ar.json --metadata-format json --output /tmp/tmpshqkk7le.dng
2023-03-05 14:24:12,745 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.4284 s
2023-03-05 14:24:13,088 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.00
2023-03-05 14:24:13,089 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1)
2023-03-05 14:24:13,090 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpp63tfqpf.json --metadata-format json --output /tmp/tmpi9poal6e.dng
2023-03-05 14:25:14,465 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3763 s
Traceback (most recent call last):
  File "/home/pi/indi-allsky/./darks.py", line 90, in <module>
    action_func()
  File "/home/pi/indi-allsky/indi_allsky/darks.py", line 418, in sigmaclip
    self._sigmaclip()
  File "/home/pi/indi-allsky/indi_allsky/darks.py", line 425, in _sigmaclip
    self._run(IndiAllSkyDarksSigmaClip)
  File "/home/pi/indi-allsky/indi_allsky/darks.py", line 584, in _run
    self._take_exposures(exposure, dark_filename_t, bpm_filename_t, ccd_bits, stacking_class)
  File "/home/pi/indi-allsky/indi_allsky/darks.py", line 667, in _take_exposures
    hdulist = self._wait_for_image(exposure_f)
  File "/home/pi/indi-allsky/indi_allsky/darks.py", line 330, in _wait_for_image
    raw = rawpy.imread(str(filename_p))
  File "/home/pi/indi-allsky/virtualenv/indi-allsky/lib/python3.9/site-packages/rawpy/__init__.py", line 20, in imread
    d.open_file(pathOrFile)
  File "rawpy/_rawpy.pyx", line 409, in rawpy._rawpy.RawPy.open_file
  File "rawpy/_rawpy.pyx", line 936, in rawpy._rawpy.RawPy.handle_error
rawpy._rawpy.LibRawIOError: b'Input/output error'
BaseClient::~BaseClient: Probability of detecting a deadlock.
baryah commented 1 year ago

Addendum: Something is still suspect with the sdcard. You may want to see if the following wiki page helps to reduce disk IO on your system: https://github.com/aaronwmorris/indi-allsky/wiki/Disk-Optimizations

Did these optimisations and installed the latest update - Its been one whole night without a single database locking ..

Update on the issue of database locking --

After more than 36 hours of flawless running - the status again became 'Down'. When I checked the status in the terminal with 'systemctl --user status indi-allsky', I got the message that the storage space is all full. I flushed the images and timelapses and restarted the service. it again started working but now the database episodes are back -

log is as follows

allsky_log.txt

StarGeezerPhil commented 1 year ago

Just to add to this, rather than in the other discussion.

sqlite3 /var/lib/indi-allsky/indi-allsky.sqlite "PRAGMA journal_mode;"

Also reports I'm in WALmode.

I'm having the same random service outage/down time after months/year of stable running - since installing the update in the last week/two.

aaronwmorris commented 1 year ago

Thanks, I'm investigating.

The service is not actually going down, it's just the timestamp of the watchdog is not being updated.

aaronwmorris commented 1 year ago

Merged #603

I found another SQLite PRAGMA that might resolve this issue. https://www.sqlite.org/pragma.html#pragma_synchronous This setting is recommended to be set to NORMAL when the journal is in WAL mode. The default is FULL, which performs a filesystem sync after every commit.

I have elected to set synchronous to OFF and just let the OS perform its own syncs. This should be advantageous for SDcards, although there is a slight chance of DB corruption of power is lost.

aaronwmorris commented 1 year ago

This resulted in a nice reduction in I/O on my system. I am going to run some tests with NORMAL, well.

Screenshot_20230305_211830

StarGeezerPhil commented 1 year ago

Thanks for the update Aaron. Let us know if we can apply changes or you're releasing an update 👍

aaronwmorris commented 1 year ago

Go ahead and apply the update. I did not run under NORMAL mode for long, it has just about as much I/O as full mode, although it could have resolved the issue as well.

I appreciate everyone's patience with these issues. I am currently working on isolating the functionality of the web interface from the main indi-allsky process so that they can run on different devices/servers. When everything is local, checking if a process is running, you can easily check for the presence of a process ID, but if they are separate, the database is the best way to communicate.

I do not fully understand why my recent changes are causing so much more I/O. Updating a database table for the WATCHDOG timestamp every 15 seconds is not exactly excessive. I suspect my recent updates regarding establishing separate database contexts/connections across the multiple processes caused more filesystem syncing to occur. Previously, I had a single shared context across all of the processes.

baryah commented 1 year ago

Thanks Aaron.

If I got it right, adding " dbapi_con.execute('PRAGMA synchronous=OFF') " to " ~/indi-allsky/indi_allsky/flask/init.py " under " def _sqlite_pragma_on_connect(dbapi_con, con_record): " should do the trick.

What would be the symptom of corrupted database. I had a power loss while the indi-allsky was running. After that I was not able to have a look at the "Images" - Clicking on 'Images' got me to an error describing page that the page could not be loaded due to something something.

aaronwmorris commented 1 year ago

I would have to see the error.

If you were editing the python files, the gunicorn process might have crashed if there was a typo at any point. You can try to restart the gunicorn process with systemctl --user restart gunicorn-indi-allsky

The log for the gunicorn is at /var/log/indi-allsky/webapp-indi-allsky.log

baryah commented 1 year ago

I would have to see the error.

If you were editing the python files, the gunicorn process might have crashed if there was a typo at any point. You can try to restart the gunicorn process with systemctl --user restart gunicorn-indi-allsky

The log for the gunicorn is at /var/log/indi-allsky/webapp-indi-allsky.log

webapp-indi-allsky.log

StarGeezerPhil commented 1 year ago

I'm not familiar with your scripts, so not confident making any edits.

Watching for any breakthroughs here...

aaronwmorris commented 1 year ago

I think I might have been too cavalier by setting synchronous to OFF. I am going to set this to NORMAL and have you guys test this. None of my cameras are exhibiting the DB locking. I am reasonably confident, based on what I read that NORMAL should resolve the locking as well.

aaronwmorris commented 1 year ago

@baryah You had a typo in your edit. dbasi_con should have been dbapi_con

dbasi_con.execute('PRAGMA synchronous=0')#012NameError: name 'dbasi_con' is not defined

baryah commented 1 year ago

@baryah You had a typo in your edit. dbasi_con should have been dbapi_con

dbasi_con.execute('PRAGMA synchronous=0')#012NameError: name 'dbasi_con' is not defined

yes I corrected it

aaronwmorris commented 1 year ago

Merged #606 for the image viewer when RAW export files are placed outside the document root.

baryah commented 1 year ago

Merged #606 for the image viewer when RAW export files are placed outside the document root.

in fact I had changed the folders for both the 'images' and the 'raw images' and nothing was displayed, though log showed that frames are being captured. - so too many changes done together and I was not sure which one is causing the issue

StarGeezerPhil commented 1 year ago

@aaronwmorris please let me know what changes to make/where if you'd like us to test (for the synchronous db settings).

aaronwmorris commented 1 year ago

I have already committed the change and merge it. Just pull down the latest code and restart.

baryah commented 1 year ago
> I have had no database locking since yesterday night. Today I tried to generate the darks, I got -
> 
> (indi-allsky) pi@indi-allsky:~/indi-allsky $ ./darks.py sigmaclip 2023-03-05 14:15:34,118 [INFO] MainProcess indi.**init**() #135: creating an instance of IndiClient 2023-03-05 14:15:34,119 [INFO] MainProcess indi.**init**() #142: PyIndi version: 1.9.7 2023-03-05 14:15:34,120 [INFO] MainProcess darks._initialize() #182: Connecting to indiserver INDI::BaseClient::connectServer: creating new connection... INDI::BaseClient::connectServer: Already connected. 2023-03-05 14:15:34,120 [INFO] MainProcess indi.serverConnected() #328: Server connected (localhost:7624) 2023-03-05 14:15:34,126 [INFO] MainProcess indi.newDevice() #198: new device CCD Simulator 2023-03-05 14:15:34,136 [INFO] MainProcess indi.newDevice() #198: new device Telescope Simulator 2023-03-05 14:15:42,129 [WARNING] MainProcess darks._initialize() #205: Connecting to device libcamera_imx477 INDI::BaseClient: Error. Unable to find driver libcamera_imx477 2023-03-05 14:15:42,190 [INFO] MainProcess miscDb.addCamera() #71: Camera DB ID: 2 2023-03-05 14:15:42,192 [INFO] MainProcess indi.updateCcdBlobMode() #390: Set BLOB mode 2023-03-05 14:15:42,192 [ERROR] MainProcess darks._initialize() #254: CCD night gain above maximum, changing to 16 2023-03-05 14:15:45,196 [ERROR] MainProcess darks._initialize() #263: CCD moon mode gain above maximum, changing to 16 2023-03-05 14:15:48,199 [ERROR] MainProcess darks._initialize() #268: CCD day gain below minimum, changing to 1 2023-03-05 14:15:51,203 [WARNING] MainProcess darks._run() #572: ****** IF THE CCD COOLER WAS ENABLED, YOU MAY CONSIDER STOPPING THIS UNTIL THE SENSOR HAS WARMED ****** 2023-03-05 14:15:59,214 [INFO] MainProcess darks._take_exposures() #648: Temp folder: /tmp/tmpv7iack15 2023-03-05 14:15:59,215 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:15:59,217 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpxprjygvm.json --metadata-format json --output /tmp/tmpim1laxxc.dng 2023-03-05 14:17:00,655 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.4402 s 2023-03-05 14:17:00,995 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.88 2023-03-05 14:17:00,996 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:17:00,997 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpfawidsq1.json --metadata-format json --output /tmp/tmp16wrbm5j.dng 2023-03-05 14:18:02,376 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3799 s 2023-03-05 14:18:02,724 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 256.96 2023-03-05 14:18:02,724 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:18:02,725 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpjmwa8902.json --metadata-format json --output /tmp/tmp__clhl51.dng 2023-03-05 14:19:04,094 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3697 s 2023-03-05 14:19:04,438 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 256.92 2023-03-05 14:19:04,439 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:19:04,440 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpbq74wsez.json --metadata-format json --output /tmp/tmpd38g30aq.dng 2023-03-05 14:20:05,809 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3703 s 2023-03-05 14:20:06,128 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.91 2023-03-05 14:20:06,128 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:20:06,129 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpn0zen3u2.json --metadata-format json --output /tmp/tmpsssjrdl1.dng 2023-03-05 14:21:07,505 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3769 s 2023-03-05 14:21:07,827 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.94 2023-03-05 14:21:07,828 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:21:07,829 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmplvn3i0aa.json --metadata-format json --output /tmp/tmp_6llz7c5.dng 2023-03-05 14:22:09,247 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.4191 s 2023-03-05 14:22:09,597 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 256.93 2023-03-05 14:22:09,597 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:22:09,598 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpgggm78b9.json --metadata-format json --output /tmp/tmpnqbqcx1y.dng 2023-03-05 14:23:10,965 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3682 s 2023-03-05 14:23:11,316 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.95 2023-03-05 14:23:11,316 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:23:11,317 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmp_2vlz9ar.json --metadata-format json --output /tmp/tmpshqkk7le.dng 2023-03-05 14:24:12,745 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.4284 s 2023-03-05 14:24:13,088 [INFO] MainProcess darks._take_exposures() #680: Image average adu: 257.00 2023-03-05 14:24:13,089 [INFO] MainProcess darks.shoot() #278: Taking 60.00000000 s exposure (gain 1) 2023-03-05 14:24:13,090 [INFO] MainProcess libcamera.setCcdExposure() #181: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 1 --shutter 60000000 --metadata /tmp/tmpp63tfqpf.json --metadata-format json --output /tmp/tmpi9poal6e.dng 2023-03-05 14:25:14,465 [INFO] MainProcess darks._take_exposures() #664: Exposure received in 61.3763 s Traceback (most recent call last): File "/home/pi/indi-allsky/./darks.py", line 90, in action_func() File "/home/pi/indi-allsky/indi_allsky/darks.py", line 418, in sigmaclip self._sigmaclip() File "/home/pi/indi-allsky/indi_allsky/darks.py", line 425, in _sigmaclip self._run(IndiAllSkyDarksSigmaClip) File "/home/pi/indi-allsky/indi_allsky/darks.py", line 584, in _run self._take_exposures(exposure, dark_filename_t, bpm_filename_t, ccd_bits, stacking_class) File "/home/pi/indi-allsky/indi_allsky/darks.py", line 667, in _take_exposures hdulist = self._wait_for_image(exposure_f) File "/home/pi/indi-allsky/indi_allsky/darks.py", line 330, in _wait_for_image raw = rawpy.imread(str(filename_p)) File "/home/pi/indi-allsky/virtualenv/indi-allsky/lib/python3.9/site-packages/rawpy/**init**.py", line 20, in imread d.open_file(pathOrFile) File "rawpy/_rawpy.pyx", line 409, in rawpy._rawpy.RawPy.open_file File "rawpy/_rawpy.pyx", line 936, in rawpy._rawpy.RawPy.handle_error rawpy._rawpy.LibRawIOError: b'Input/output error' BaseClient::~BaseClient: Probability of detecting a deadlock.

Not able to make out why am I not able to generate darks??

aaronwmorris commented 1 year ago

Are you running out of space?

baryah commented 1 year ago

No have plenty of space - some 90GB or so

On 08-Mar-2023, at 9:56 PM, Aaron W Morris @.***> wrote:

Are you running out of space?

— Reply to this email directly, view it on GitHub https://github.com/aaronwmorris/indi-allsky/issues/593#issuecomment-1460455231, or unsubscribe https://github.com/notifications/unsubscribe-auth/AKQZSXVRHFYERT7WFHMJHSDW3CXKDANCNFSM6AAAAAAVJXGWQI. You are receiving this because you were mentioned.

aaronwmorris commented 1 year ago

I am reviewing my code and I already catch conditions if the image file is missing or if the size is 0.

I can only assume the file is either corrupt or the rawpy module does not like the format. Can you try reseating your CSI cable in both the raspi and the camera module?

baryah commented 1 year ago

I am reviewing my code and I already catch conditions if the image file is missing or if the size is 0.

I can only assume the file is either corrupt or the rawpy module does not like the format. Can you try reseating your CSI cable in both the raspi and the camera module?

Done. In fact used a different RPi 4b 8GB and a different RPi HQ Camera It did not help - getting the same error

StarGeezerPhil commented 1 year ago

Hi @aaronwmorris, despite a promising start after the update I'm afraid it was still going down.

Just checked and it appears there's another updated package - installed it, but I'm now getting Internal Server Error.

About to try a cold power off/on cycle. UPDATE: no success.

aaronwmorris commented 1 year ago

What is the exception in /var/log/indi-allsky/webapp-indi-allsky.log?

StarGeezerPhil commented 1 year ago

Sorry - just doing the update again (I didn't replace INDI for the first time that last time - don't know if that's likely to have made any difference).

StarGeezerPhil commented 1 year ago

Same after trying the update again.

Exception from:/var/log/indi-allsky/webapp-indi-allsky.log as below:

Mar  9 21:44:53 SkyCam [ERROR] MainProcess app.log_exception() #1741: Exception on /indi-allsky/ [GET]#012Traceback (most recent call last):#012  File "/home/StarGeezer/indi-allsky/virtualenv/indi-allsky/lib/python3.9//site-packages/sqlalchemy/engine/result.py", line 614, in _only_one_row#012    raise exc.MultipleResultsFound(#012sqlalchemy.exc.MultipleResultsFound: Multiple rows were found when exactly one was required
aaronwmorris commented 1 year ago

I will need a little more of that exception to see where it was generated.

aaronwmorris commented 1 year ago

I think I found it. One of my test cameras is generating the exception.

Fix included in #619

StarGeezerPhil commented 1 year ago

Awesome, thanks Aaron. Updated and the webUI is back online.

However, new imageworker exception:

Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception: 
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception: KeyError: 'CCD_INFO'
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception:     elif self.config['CCD_INFO']['CCD_CFA']['CFA_TYPE'].get('text'):
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception:   File "/home/pi/indi-allsky/indi_allsky/image.py", line 1570, in add
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception:     self.image_processor.add(filename_p, exposure, exp_date, exp_elapsed, camera)
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception:   File "/home/pi/indi-allsky/indi_allsky/image.py", line 288, in processImage
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception:     self.processImage(i_dict)
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception:   File "/home/pi/indi-allsky/indi_allsky/image.py", line 224, in saferun
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception:     self.saferun()
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception:   File "/home/pi/indi-allsky/indi_allsky/image.py", line 195, in run
Mar 10 00:23:50 SkyCam [ERROR] MainProcess allsky._startImageWorker() #734: Image worker exception: Traceback (most recent call last):
Mar 10 00:23:50 SkyCam [INFO] MainProcess allsky.run() #997: Exposure state: READY
Mar 10 00:23:50 SkyCam [INFO] MainProcess allsky.run() #996: Camera last ready: 9.5s
Mar 10 00:23:40 SkyCam [INFO] MainProcess allsky.run() #1142: Exposure received in 1.4525 s (1.4515)
Mar 10 00:23:39 SkyCam [INFO] MainProcess allsky.run() #1211: Total time since last exposure 45.1407 s
Mar 10 00:23:39 SkyCam [INFO] MainProcess libcamera.setCcdExposure() #192: image command: libcamera-still --immediate --nopreview --raw --denoise off --awbgains 1,1 --gain 16 --shutter 1000 --metadata /tmp/tmpa3iadv4d.json --metadata-format json --output /tmp/tmpsajhurw5.dng
aaronwmorris commented 1 year ago

Merged #620 to pass the CFA pattern between processes. Unfortunately, this required a database change so it will be necessary to re-run setup.sh.

baryah commented 1 year ago

Hi Aaron. I am still not able to capture Darks. Previously, the error would arise after around 9 to 10 exposures, now it is happening after the first one - Timeout

  INDI::BaseClient: Error. Unable to find driver libcamera_imx477
  2023-03-10 13:06:50,223 [INFO] MainProcess miscDb.addCamera() #80: Camera DB ID: 1
  2023-03-10 13:06:50,223 [INFO] MainProcess indi.updateCcdBlobMode() #400: Set BLOB mode
  2023-03-10 13:06:50,224 [ERROR] MainProcess darks._initialize() #282: CCD day gain below minimum, changing to 1
  2023-03-10 13:06:53,227 [WARNING] MainProcess darks._run() #586: ****** IF THE CCD COOLER WAS ENABLED, YOU MAY CONSIDER STOPPING THIS UNTIL THE SENSOR HAS WARMED ******
  2023-03-10 13:07:01,239 [INFO] MainProcess darks._take_exposures() #662: Temp folder: /tmp/tmpbhab4xsy
  2023-03-10 13:07:01,240 [INFO] MainProcess darks.shoot() #292: Taking 15.00000000 s exposure (gain 1)
  2023-03-10 13:07:01,242 [INFO] MainProcess libcamera.setCcdExposure() #192: image command: libcamera-still --immediate --nopreview --encoding jpg --quality 100 --gain 1 --shutter 15000000 --metadata /tmp/tmpe7tk3vu6.json --metadata-format json --output /tmp/tmptt7bydh5.jpg
  2023-03-10 13:07:41,283 [ERROR] MainProcess libcamera.setCcdExposure() #209: Exposure timeout
  Traceback (most recent call last):
    File "/home/pi/indi-allsky/indi_allsky/camera/libcamera.py", line 207, in setCcdExposure
      self.libcamera_process.wait(timeout=timeout)
    File "/usr/lib/python3.9/subprocess.py", line 1189, in wait
      return self._wait(timeout=timeout)
    File "/usr/lib/python3.9/subprocess.py", line 1911, in _wait
      raise TimeoutExpired(self.args, timeout)
  subprocess.TimeoutExpired: Command '['libcamera-still', '--immediate', '--nopreview', '--encoding', 'jpg', '--quality', '100', '--gain', '1', '--shutter', '15000000', '--metadata', '/tmp/tmpe7tk3vu6.json', '--metadata-format', 'json', '--output', '/tmp/tmptt7bydh5.jpg']' timed out after 40.0 seconds

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/home/pi/indi-allsky/./darks.py", line 90, in <module>
      action_func()
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 432, in sigmaclip
      self._sigmaclip()
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 439, in _sigmaclip
      self._run(IndiAllSkyDarksSigmaClip)
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 598, in _run
      self._take_exposures(exposure, dark_filename_t, bpm_filename_t, ccd_bits, stacking_class)
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 674, in _take_exposures
      self.shoot(exposure_f, sync=True, timeout=timeout)
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 294, in shoot
      self.indiclient.setCcdExposure(exposure, sync=sync, timeout=timeout)
    File "/home/pi/indi-allsky/indi_allsky/camera/libcamera.py", line 210, in setCcdExposure
      raise TimeOutException('Timeout waiting for exposure')
  indi_allsky.exceptions.TimeOutException: Timeout waiting for exposure
  BaseClient::~BaseClient: Probability of detecting a deadlock.
StarGeezerPhil commented 1 year ago

Merged #620 to pass the CFA pattern between processes. Unfortunately, this required a database change so it will be necessary to re-run setup.sh.

Thanks @aaronwmorris, just to confirm this has worked for me. I'll let you know if it stays up or if it continues to go down.

baryah commented 1 year ago

Hi Aaron. I am still not able to capture Darks. Previously, the error would arise after around 9 to 10 exposures, now it is happening after the first one - Timeout

  INDI::BaseClient: Error. Unable to find driver libcamera_imx477
  2023-03-10 13:06:50,223 [INFO] MainProcess miscDb.addCamera() #80: Camera DB ID: 1
  2023-03-10 13:06:50,223 [INFO] MainProcess indi.updateCcdBlobMode() #400: Set BLOB mode
  2023-03-10 13:06:50,224 [ERROR] MainProcess darks._initialize() #282: CCD day gain below minimum, changing to 1
  2023-03-10 13:06:53,227 [WARNING] MainProcess darks._run() #586: ****** IF THE CCD COOLER WAS ENABLED, YOU MAY CONSIDER STOPPING THIS UNTIL THE SENSOR HAS WARMED ******
  2023-03-10 13:07:01,239 [INFO] MainProcess darks._take_exposures() #662: Temp folder: /tmp/tmpbhab4xsy
  2023-03-10 13:07:01,240 [INFO] MainProcess darks.shoot() #292: Taking 15.00000000 s exposure (gain 1)
  2023-03-10 13:07:01,242 [INFO] MainProcess libcamera.setCcdExposure() #192: image command: libcamera-still --immediate --nopreview --encoding jpg --quality 100 --gain 1 --shutter 15000000 --metadata /tmp/tmpe7tk3vu6.json --metadata-format json --output /tmp/tmptt7bydh5.jpg
  2023-03-10 13:07:41,283 [ERROR] MainProcess libcamera.setCcdExposure() #209: Exposure timeout
  Traceback (most recent call last):
    File "/home/pi/indi-allsky/indi_allsky/camera/libcamera.py", line 207, in setCcdExposure
      self.libcamera_process.wait(timeout=timeout)
    File "/usr/lib/python3.9/subprocess.py", line 1189, in wait
      return self._wait(timeout=timeout)
    File "/usr/lib/python3.9/subprocess.py", line 1911, in _wait
      raise TimeoutExpired(self.args, timeout)
  subprocess.TimeoutExpired: Command '['libcamera-still', '--immediate', '--nopreview', '--encoding', 'jpg', '--quality', '100', '--gain', '1', '--shutter', '15000000', '--metadata', '/tmp/tmpe7tk3vu6.json', '--metadata-format', 'json', '--output', '/tmp/tmptt7bydh5.jpg']' timed out after 40.0 seconds

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/home/pi/indi-allsky/./darks.py", line 90, in <module>
      action_func()
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 432, in sigmaclip
      self._sigmaclip()
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 439, in _sigmaclip
      self._run(IndiAllSkyDarksSigmaClip)
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 598, in _run
      self._take_exposures(exposure, dark_filename_t, bpm_filename_t, ccd_bits, stacking_class)
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 674, in _take_exposures
      self.shoot(exposure_f, sync=True, timeout=timeout)
    File "/home/pi/indi-allsky/indi_allsky/darks.py", line 294, in shoot
      self.indiclient.setCcdExposure(exposure, sync=sync, timeout=timeout)
    File "/home/pi/indi-allsky/indi_allsky/camera/libcamera.py", line 210, in setCcdExposure
      raise TimeOutException('Timeout waiting for exposure')
  indi_allsky.exceptions.TimeOutException: Timeout waiting for exposure
  BaseClient::~BaseClient: Probability of detecting a deadlock.

I found out that the 'metadata file' in /tmp folder is not being generated.