blakeblackshear / frigate

NVR with realtime local object detection for IP cameras
https://frigate.video
MIT License

[Support]: tmpfs keeps filling up and frigate goes unresponsive #5403

Closed blauter closed 1 year ago

blauter commented 1 year ago

Describe the problem you are having

I am running Frigate 12 beta 6, but I also had this issue with prior 11 builds. I was hoping my issue was related to the issue addressed in beta 6 where nginx was holding on to files after they were deleted, but that doesn't appear to be the case.

Frigate runs fine for some time, but eventually /tmp/cache fills up and Frigate becomes unresponsive. It will be fine for hours and this directory will only contain 3-6 files, but once the issue occurs, it fills up fast.

# docker exec -it frigate df -h /tmp/cache
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           2.0G  2.0G     0 100% /tmp/cache

I am running in docker on an Ubuntu 22.04 host, virtualized under Proxmox. I am using a Coral and passing the whole PCI bus to the VM.

Logs aren't showing anything except the regular connections to the stats API from Home Assistant, then eventually this.

Version

12 Beta 6

Frigate config file

mqtt:
  host: <ip>
  user: mqtt_frigate
  password: <password>

detectors:
  coral:
    type: edgetpu
    device: usb

rtmp:
  enabled: false

go2rtc:
  streams:
    Doorbell: ffmpeg:rtsp://admin:<password>@<ip>:554/cam/realmonitor?channel=1&subtype=0&stream=main&authbasic=64
    Garage: ffmpeg:rtsp://view:view@<ip>:554/h264_stream
    Front: "ffmpeg:http://<ip>/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=admin&password=<password>"

cameras:
  Doorbell:
    ffmpeg:
      inputs:
        - path: rtsp://admin:<password>@<ip>:554/cam/realmonitor?channel=1&subtype=0&stream=main&authbasic=64
          roles:
            #- rtmp
            - record
        - path: rtsp://admin:<password>@<ip>:554/cam/realmonitor?channel=1&subtype=1&stream=sub&authbasic=64
          roles:
            - detect
    motion:
      mask:
        - 93,318,720,251,720,214,720,201,720,0,0,0,0,484
    detect:
      width: 720
      height: 576
      fps: 15
    snapshots:
      enabled: True
      timestamp: True
      bounding_box: True
      retain:
        default: 14
          #mqtt:
          #timestamp: False
          #bounding_box: False
          #crop: True
          #height: 500

  Garage:
    ffmpeg:
      inputs:
        - path: rtsp://view:view@<ip>:554/h264_stream
          roles:
            #- rtmp
            - record
        - path: rtsp://view:view@<ip>:554/stream1
          roles:
            - detect
    detect:
      width: 640
      height: 360
      fps: 12

  Front:
    ffmpeg:
      hwaccel_args: []
      input_args:
        - -avoid_negative_ts
        - make_zero
        - -fflags
        - nobuffer+genpts+discardcorrupt
        - -flags
        - low_delay
        - -strict
        - experimental
        - -analyzeduration
        - 1000M
        - -probesize
        - 1000M
        - -rw_timeout
        - "5000000"
      inputs:
        - path: http://<ip>/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=admin&password=<password>
          roles:
            - record
              #- rtmp
        - path: http://<ip>/flv?port=1935&app=bcs&stream=channel0_ext.bcs&user=admin&password=<password>
          roles:
            - detect
    detect:
      width: 640
      height: 360
      fps: 7

record:
  enabled: True
    #retain_days: 0
  retain:
    days: 7
    mode: motion
  events:
    retain:
      default: 14
      mode: active_objects

snapshots:
  enabled: True
  timestamp: True
  bounding_box: True
  retain:
    default: 10
    objects:
      person: 15

birdseye:
  # Optional: Enable birdseye view (default: shown below)
  enabled: True
  # Optional: Width of the output resolution (default: shown below)
  # width: 1280
  # Optional: Height of the output resolution (default: shown below)
  # height: 720
  # Optional: Encoding quality of the mpeg1 feed (default: shown below)
  # 1 is the highest quality, and 31 is the lowest. Lower quality feeds utilize less CPU resources.
  quality: 1
  # Optional: Mode of the view. Available options are: objects, motion, and continuous
  #   objects - cameras are included if they have had a tracked object within the last 30 seconds
  #   motion - cameras are included if motion was detected in the last 30 seconds
  #   continuous - all cameras are included always
  mode: continuous

ffmpeg:
  hwaccel_args:
    - -hwaccel
    - vaapi
    - -hwaccel_device
    - /dev/dri/renderD128
    - -hwaccel_output_format
    - yuv420p

Relevant log output

2023-02-02 06:06:27.593128601  [2023-02-02 06:06:27] ffmpeg.Garage.record           ERROR   : [segment @ 0x5624d4bf2800] Timestamps are unset in a packet for stream 0. This is deprecated and will stop working in the future. Fix your code to set the timestamps properly
2023-02-02 06:06:27.593157544  [2023-02-02 06:06:27] ffmpeg.Garage.record           ERROR   : [segment @ 0x5624d4bf2800] Non-monotonous DTS in output stream 0:0; previous: 0, current: 0; changing to 1. This may result in incorrect timestamps in the output file.
2023-02-02 06:06:27.593159886  [2023-02-02 06:06:27] ffmpeg.Garage.record           ERROR   : [segment @ 0x5624d4bf2800] Failure occurred when ending segment '/tmp/cache/Garage-20230202060618.mp4'
2023-02-02 06:06:27.593162473  [2023-02-02 06:06:27] ffmpeg.Garage.record           ERROR   : av_interleaved_write_frame(): No space left on device

FFprobe output from your camera

n/a

Frigate stats

No response

Operating system

Other Linux

Install method

Docker Compose

Coral version

USB

Network connection

Wired

Camera make and model

n/a

Any other information that may be helpful

No response

Docker config:

version: "3.9"
services:
  frigate:
    privileged: true
    container_name: frigate
    restart: unless-stopped
    image: ghcr.io/blakeblackshear/frigate:0.12.0-beta6
    shm_size: "1024mb"
    devices:
      - /dev/bus/usb:/dev/bus/usb
      - /dev/dri/renderD128
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /data/frigate/config/config.yml:/config/config.yml:ro
      - /data/frigate/data:/media/frigate
      - type: tmpfs 
        target: /tmp/cache
        tmpfs:
          size: 2147483648
    ports:
      - "5000:5000"
      - "1935:1935"
      - "8554:8554"
    environment:
      FRIGATE_RTSP_PASSWORD: "<password>"
networks:
  default:
    name: frigate_net
NickM-27 commented 1 year ago

More information is needed. What files are in /tmp/cache ?
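For example (assuming the container is named frigate, as in the compose file above), from the host:

docker exec frigate ls -lh /tmp/cache
docker exec frigate du -sh /tmp/cache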

blauter commented 1 year ago

Thanks for responding. I restarted to get things working again so I can't grab the current output, but it's always a ton of mp4s from all 3 of my cameras.

NickM-27 commented 1 year ago

Thanks for responding. I restarted to get things working again so I can't grab the current output, but it's always a ton of mp4s from all 3 of my cameras.

will need to see the list of files specifically, since the files could be from a number of different places. also you should update this post to include your config file

blauter commented 1 year ago

Updated original post with Frigate config and docker compose contents.

The issue occurs daily, so I will grab the file list the next time it occurs. If I recall correctly, they were .mp4 files.

Thanks.

blauter commented 1 year ago

Here is the output currently. When the issue occurs, the files follow the same naming convention as below, just with a lot more of them.

root@7ef6a92a5e40:/tmp/cache# ls -al
total 12512
drwxr-xr-x 2 root root     140 Feb  6 18:08 .
drwxrwxrwt 1 root root    4096 Feb  2 01:12 ..
-rw-r--r-- 1 root root 2640980 Feb  6 18:08 Doorbell-20230206180815.mp4
-rw-r--r-- 1 root root  786480 Feb  6 18:08 Doorbell-20230206180824.mp4
-rw-r--r-- 1 root root 6553648 Feb  6 18:08 Front-20230206180817.mp4
-rw-r--r-- 1 root root 2023577 Feb  6 18:08 Garage-20230206180816.mp4
-rw-r--r-- 1 root root  786480 Feb  6 18:08 Garage-20230206180824.mp4
NickM-27 commented 1 year ago

might be related to https://github.com/blakeblackshear/frigate/issues/4699 then

I'm not able to reproduce this, so I'm not really sure.

blauter commented 1 year ago

My issue does seem related to that one. Is there anything I can do to troubleshoot the issue myself? I enabled debug logging, but is there anything else you can think of?

And my configuration did work fine in docker under Ubuntu on bare metal, but when my disk died, I moved to Proxmox to get back up and running.

Unrelated, I am now seeing this on restart.

2023-02-06 18:43:27.009565536  [2023-02-06 18:43:27] frigate.app                    INFO    : Starting Frigate (0.12.0-ea8ec23)
2023-02-06 18:43:27.173283630  [2023-02-06 18:43:27] peewee_migrate                 INFO    : Starting migrations
2023-02-06 18:43:27.415842379  [2023-02-06 18:43:27] peewee_migrate                 INFO    : There is nothing to migrate
2023-02-06 18:43:27.433572939  [2023-02-06 18:43:27] frigate.app                    INFO    : Output process started: 280
2023-02-06 18:43:27.438055231  [2023-02-06 18:43:27] frigate.app                    INFO    : Camera processor started for Doorbell: 284
2023-02-06 18:43:27.442092455  [2023-02-06 18:43:27] frigate.app                    INFO    : Camera processor started for Garage: 286
2023-02-06 18:43:27.446522998  [2023-02-06 18:43:27] detector.coral                 INFO    : Starting detection process: 279
2023-02-06 18:43:30.255397349  [2023-02-06 18:43:27] frigate.detectors.plugins.edgetpu_tfl INFO    : Attempting to load TPU as usb
2023-02-06 18:43:30.255400358  [2023-02-06 18:43:27] frigate.app                    INFO    : Camera processor started for Front: 287
2023-02-06 18:43:30.255404944  [2023-02-06 18:43:27] frigate.app                    INFO    : Capture process started for Doorbell: 291
2023-02-06 18:43:30.255406705  [2023-02-06 18:43:27] frigate.app                    INFO    : Capture process started for Garage: 296
2023-02-06 18:43:30.255417308  [2023-02-06 18:43:27] frigate.app                    INFO    : Capture process started for Front: 298
2023-02-06 18:43:30.297356463  [2023-02-06 18:43:30] frigate.detectors.plugins.edgetpu_tfl INFO    : TPU found
2023-02-06 18:44:28.708695215  Exception in thread recording_cleanup:
2023-02-06 18:44:28.708697948  Traceback (most recent call last):
2023-02-06 18:44:28.708713660    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3237, in execute_sql
2023-02-06 18:44:28.710925433      cursor.execute(sql, params or ())
2023-02-06 18:44:28.710953329  sqlite3.DatabaseError: database disk image is malformed
2023-02-06 18:44:28.710954543  
2023-02-06 18:44:28.710955758  During handling of the above exception, another exception occurred:
2023-02-06 18:44:28.710956642  
2023-02-06 18:44:28.710957636  Traceback (most recent call last):
2023-02-06 18:44:28.710958713    File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
2023-02-06 18:44:28.712051798      self.run()
2023-02-06 18:44:28.712053803    File "/opt/frigate/frigate/record.py", line 630, in run
2023-02-06 18:44:28.712871587      self.expire_recordings()
2023-02-06 18:44:28.712873554    File "/opt/frigate/frigate/record.py", line 486, in expire_recordings
2023-02-06 18:44:28.713059997      for recording in recordings.objects().iterator():
2023-02-06 18:44:28.713062108    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2039, in iterator
2023-02-06 18:44:28.713290988      return iter(self.execute(database).iterator())
2023-02-06 18:44:28.713292952    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 1962, in inner
2023-02-06 18:44:28.713527232      return method(self, database, *args, **kwargs)
2023-02-06 18:44:28.713529225    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2033, in execute
2023-02-06 18:44:28.713733360      return self._execute(database)
2023-02-06 18:44:28.713734826    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2206, in _execute
2023-02-06 18:44:28.713963899      cursor = database.execute(self)
2023-02-06 18:44:28.713965502    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3250, in execute
2023-02-06 18:44:28.714306162      return self.execute_sql(sql, params, commit=commit)
2023-02-06 18:44:28.714307565    File "/usr/local/lib/python3.9/dist-packages/playhouse/sqliteq.py", line 249, in execute_sql
2023-02-06 18:44:28.714789100      return self._execute(sql, params, commit=commit)
2023-02-06 18:44:28.714791175    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3244, in execute_sql
2023-02-06 18:44:28.715179095      self.commit()
2023-02-06 18:44:28.715181119    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3010, in __exit__
2023-02-06 18:44:28.715524890      reraise(new_type, new_type(exc_value, *exc_args), traceback)
2023-02-06 18:44:28.715526398    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 192, in reraise
2023-02-06 18:44:28.715612663      raise value.with_traceback(tb)
2023-02-06 18:44:28.715615450    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3237, in execute_sql
2023-02-06 18:44:28.715935424      cursor.execute(sql, params or ())
2023-02-06 18:44:28.715948572  peewee.DatabaseError: database disk image is malformed
2023-02-06 18:48:27.508391468  Exception in thread storage_maintainer:
2023-02-06 18:48:27.508407105  Traceback (most recent call last):
2023-02-06 18:48:27.508408335    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3237, in execute_sql
2023-02-06 18:48:27.509035038      cursor.execute(sql, params or ())
2023-02-06 18:48:27.509164022  sqlite3.DatabaseError: database disk image is malformed
2023-02-06 18:48:27.509165678  
2023-02-06 18:48:27.509166941  During handling of the above exception, another exception occurred:
2023-02-06 18:48:27.509167934  
2023-02-06 18:48:27.509168948  Traceback (most recent call last):
2023-02-06 18:48:27.509170101    File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
2023-02-06 18:48:27.509283942      self.run()
2023-02-06 18:48:27.509285499    File "/opt/frigate/frigate/storage.py", line 186, in run
2023-02-06 18:48:27.509974402      self.calculate_camera_bandwidth()
2023-02-06 18:48:27.509976492    File "/opt/frigate/frigate/storage.py", line 38, in calculate_camera_bandwidth
2023-02-06 18:48:27.510075606      Recordings.select(fn.COUNT(Recordings.id))
2023-02-06 18:48:27.510086396    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 1962, in inner
2023-02-06 18:48:27.510334760      return method(self, database, *args, **kwargs)
2023-02-06 18:48:27.510373067    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2227, in scalar
2023-02-06 18:48:27.511008768      row = self.tuples().peek(database)
2023-02-06 18:48:27.511010549    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 1962, in inner
2023-02-06 18:48:27.511011653      return method(self, database, *args, **kwargs)
2023-02-06 18:48:27.511012862    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2212, in peek
2023-02-06 18:48:27.511047447      rows = self.execute(database)[:n]
2023-02-06 18:48:27.511048749    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 1962, in inner
2023-02-06 18:48:27.511276752      return method(self, database, *args, **kwargs)
2023-02-06 18:48:27.511308948    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2033, in execute
2023-02-06 18:48:27.511498642      return self._execute(database)
2023-02-06 18:48:27.511500150    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 2206, in _execute
2023-02-06 18:48:27.511733394      cursor = database.execute(self)
2023-02-06 18:48:27.511735516    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3250, in execute
2023-02-06 18:48:27.512052371      return self.execute_sql(sql, params, commit=commit)
2023-02-06 18:48:27.512063685    File "/usr/local/lib/python3.9/dist-packages/playhouse/sqliteq.py", line 249, in execute_sql
2023-02-06 18:48:27.512148390      return self._execute(sql, params, commit=commit)
2023-02-06 18:48:27.512149947    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3244, in execute_sql
2023-02-06 18:48:27.513080729      self.commit()
2023-02-06 18:48:27.513082502    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3010, in __exit__
2023-02-06 18:48:27.513427496      reraise(new_type, new_type(exc_value, *exc_args), traceback)
2023-02-06 18:48:27.513438057    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 192, in reraise
2023-02-06 18:48:27.513500429      raise value.with_traceback(tb)
2023-02-06 18:48:27.513542990    File "/usr/local/lib/python3.9/dist-packages/peewee.py", line 3237, in execute_sql
2023-02-06 18:48:27.513833534      cursor.execute(sql, params or ())
2023-02-06 18:48:27.513844047  peewee.DatabaseError: database disk image is malformed
NickM-27 commented 1 year ago

looks like your DB is corrupt; that might be the reason for the issue
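If you want to confirm before wiping anything, sqlite can check the file directly (a sketch; the exact path depends on your database config, and you may need to run sqlite3 from the host against the mapped volume if it isn't available in the container):

sqlite3 /media/frigate/frigate.db "PRAGMA integrity_check;"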

blauter commented 1 year ago

This error just started happening. This was not present the previous 50+ times the issue occurred.

Can I just stop, delete it, and restart?

NickM-27 commented 1 year ago

Can I just stop, delete it, and restart?

yes
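Roughly (a sketch, assuming the database sits at its default location under the media volume mapped in your compose file above; adjust if you have database: path set):

docker compose stop frigate
rm /data/frigate/data/frigate.db*   # removes frigate.db plus any -wal/-shm journal files
docker compose up -d frigate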

blauter commented 1 year ago

Done, thanks. I will report back if the issue happens again. Hopefully with debug logs.

aeozyalcin commented 1 year ago

Hey y'all, I am running into this issue as well, where my tmpfs is filling up. On my machine, I see that 3.8GB is allocated for tmpfs, and about every 90 minutes it fills up and causes the camera stream to stall. I went over to #4699 per Nick's suggestion, but it didn't look like there was a clear solution.

I have an Intel system, and I am already using VAAPI. Some people suggested switching from QSV to VAAPI solved the issue, but I am already on VAAPI. Any help is appreciated.

edit: I am on Beta-8, but it looks like the issue existed on previous beta versions of v12 as well.

NickM-27 commented 1 year ago

Hey y'all, I am running into this issue as well, where my tmpfs is filling up. On my machine, I see that 3.8GB is allocated for tmpfs, and about every 90 minutes it fills up and causes the camera stream to stall. I went over to #4699 per Nick's suggestion, but it didn't look like there was a clear solution.

I have an Intel system, and I am already using VAAPI. Some people suggested switching from QSV to VAAPI solved the issue, but I am already on VAAPI. Any help is appreciated.

edit: I am on Beta-8, but it looks like the issue existed on previous beta versions of v12 as well.

as was requested above, more info is needed, otherwise there is no way to help. There is no telling what is using that space.

blauter commented 1 year ago

So this issue has stopped occurring daily for me and it has been stable since last week. I can't say for sure it was deleting the database that fixed it, as @NickM-27 suggested, because I did a couple of other things at the same time, but give it a try; I would put money on that being the solution in my case. I deleted all of my data including the DB, so you will lose all events/recordings.

Also note that I only started seeing the message about the DB issue in the log after upgrading to a later beta version of 12, even though the issue has been around since well before I upgraded to 12. Perhaps logging for this type of issue was improved in later versions and the issue was always there, just not detected?

Two other things I did at the same time:

  1. Updated the BIOS on my motherboard. Nothing in the changelog that I can say for sure fixed it, but it was pretty old and I noticed they listed some improvements around virtualization.
  2. Changed the USB port of my Coral. Probably not related. I did this because I was seeing some console messages related to "Transfer event TRB DMA". The messages are still occurring, so I don't think it's related.
louisr0608 commented 1 year ago

Hi all,

I seem to be having the same or a related issue, though my tmpfs never seems to fill completely. All of the runs I gathered this information from were after removing all of the media/database files.

My setup:

Version - 12 beta 8

Docker Compose - frigate_docker_compose.txt

Config - frigate_sanitized_config.txt

Frigate Log - frigate_log.txt

tmpfs Log - frigate_tmpfs_log.txt. The tmpfs log was generated with the following command inside the docker container: watch -t -n 1 "(date '+TIME:%H:%M:%S' ; df -h /tmp/cache; ls -al /tmp/cache; echo ''; echo '') | tee -a frigate_tmpfs_log.txt"

My 5 Cameras:

Patio - Reolink RLC-810A running at 1440p (this is so I can use the HTTP/h264 main stream rather than the buggier RTSP stream at max resolution)
Backyard Left - Same as Patio
Backyard Right - Same as Patio
Backyard Dogs - Reolink RLC-410
Driveway - Reolink RLC-410

Everything runs fine after some hiccups during startup. However, almost exactly every 84 minutes, all of the ffmpeg processes become unresponsive. The table at the end of my post is extracted from logs from an overnight run (not the same logs attached to this post). The timestamps are those of the "Ffmpeg process crashed unexpectedly for" messages, with the difference column showing the time between the messages as 84 minutes +/- a few seconds.

The tmpfs log shows an increase in tmpfs usage that aligns with the time that ffmpeg begins to error out in the Frigate log (around 10:31am).

I'm hoping this extra information helps debug this.

Regards

Timestamp | Difference | Camera
-- | -- | --
21:01:42 |   | Driveway
22:25:43 | 1:24:01 | Driveway
23:49:44 | 1:24:01 | Driveway
1:13:46 | 1:24:02 | Driveway
2:37:48 | 1:24:02 | Driveway
4:01:42 | 1:23:54 | Driveway
5:25:45 | 1:24:03 | Driveway
6:49:47 | 1:24:02 | Driveway
21:01:42 |   | Backyard_dogs
22:25:43 | 1:24:01 | Backyard_dogs
23:49:44 | 1:24:01 | Backyard_dogs
1:13:46 | 1:24:02 | Backyard_dogs
2:37:48 | 1:24:02 | Backyard_dogs
4:01:52 | 1:24:04 | Backyard_dogs
5:25:53 | 1:24:01 | Backyard_dogs
5:26:37 | 0:00:44 | Backyard_dogs
5:27:20 | 0:00:43 | Backyard_dog
6:51:23 | 1:24:03 | Backyard_dogs
21:01:42 |   | Backyard_Left
22:25:43 | 1:24:01 | Backyard_Left
23:49:44 | 1:24:01 | Backyard_Left
1:13:46 | 1:24:02 | Backyard_Left
2:37:48 | 1:24:02 | Backyard_Left
4:01:42 | 1:23:54 | Backyard_Left
5:25:45 | 1:24:03 | Backyard_Left
6:49:47 | 1:24:02 | Backyard_Left
21:01:42 |   | Backyard_Right
22:25:43 | 1:24:01 | Backyard_Right
23:49:44 | 1:24:01 | Backyard_Right
1:13:46 | 1:24:02 | Backyard_Right
2:37:48 | 1:24:02 | Backyard_Right
4:01:42 | 1:23:54 | Backyard_Right
5:25:45 | 1:24:03 | Backyard_Right
6:49:47 | 1:24:02 | Backyard_Right
21:01:42 |   | Patio
22:25:43 | 1:24:01 | Patio
23:49:44 | 1:24:01 | Patio
1:13:46 | 1:24:02 | Patio
2:37:48 | 1:24:02 | Patio
4:01:52 | 1:24:04 | Patio
5:25:53 | 1:24:01 | Patio
5:26:37 | 0:00:44 | Patio
5:27:20 | 0:00:43 | Patio
6:51:23 | 1:24:03 | Patio
NickM-27 commented 1 year ago

@louisr0608 I don't think it's necessarily the same issue. A few thoughts:

  1. Would want to see your go2rtc logs. Most likely they are complaining about the http stream timing out / exceeding the duration, which is a known issue with reolink cameras. You may want to try using the ffmpeg: modifier on those go2rtc streams (see the sketch below this list).
  2. What kind of drive are your recordings stored on? You may want to enable debug logs for the record process (also sketched below) and see what the move times from the cache to storage are; it seems like it may not be keeping up.
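In config terms, both of those look roughly like this (a sketch with a placeholder camera name and URL, not your exact values):

go2rtc:
  streams:
    Patio: ffmpeg:http://<camera-ip>/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=<user>&password=<password>

logger:
  logs:
    frigate.record: debug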
louisr0608 commented 1 year ago

Hi, thanks for the info.

You're right, go2rtc is coming up with timeout errors at the same time (log below). I'll add the ffmpeg: modifier to the streams and give it a try.

The recordings are stored on a spinning HD, and in general the computer I'm running this on is old and slow. I'll enable debug on the record processes...and maybe look into upgrading some things :)

Thanks again!

2023-02-16 09:06:02.749160745 09:06:02.749 INF go2rtc version 1.1.2 linux/amd64
2023-02-16 09:06:02.749416692 09:06:02.749 INF [api] listen addr=:1984
2023-02-16 09:06:02.749693642 09:06:02.749 INF [rtsp] listen addr=:8554
2023-02-16 09:06:02.750033806 09:06:02.750 INF [srtp] listen addr=:8443
2023-02-16 09:06:02.750219330 09:06:02.750 INF [webrtc] listen addr=:8555
2023-02-16 10:29:43.713120439 10:29:43.699 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.232/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=&password=
2023-02-16 10:29:55.513024402 10:29:55.512 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.230/flv?port=1935&app=bcs&stream=channel0_ext.bcs&user=&password=
2023-02-16 10:29:55.513879571 10:29:55.513 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.230/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=&password=
2023-02-16 10:29:55.513978348 10:29:55.513 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.234/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=&password=
2023-02-16 10:29:55.514214993 10:29:55.514 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.231/flv?port=1935&app=bcs&stream=channel0_ext.bcs&user=&password=
2023-02-16 10:29:55.514261429 10:29:55.514 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.231/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=&password=
2023-02-16 10:29:55.514306830 10:29:55.514 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.233/flv?port=1935&app=bcs&stream=channel0_ext.bcs&user=&password=
2023-02-16 10:29:55.514334018 10:29:55.514 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.233/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=&password=
2023-02-16 10:30:35.260667174 10:30:35.260 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.234/flv?port=1935&app=bcs&stream=channel0_ext.bcs&user=&password=
2023-02-16 10:30:35.260859398 10:30:35.260 WRN github.com/AlexxIT/go2rtc/cmd/streams/producer.go:132 > error="context deadline exceeded (Client.Timeout or context cancellation while reading body)" url=http://192.168.1.232/flv?port=1935&app=bcs&stream=channel0_ext.bcs&user=&password=

aeozyalcin commented 1 year ago

@louisr0608 looks like you also have Reolink cameras, just like me. I dove into this and found that changing the ffmpeg input_args fixed it. My theory is that the default preset-rtsp-restream preset and the Reolink cameras/FFmpeg combo don't mesh well. You would need to comment out the line where you use the rtsp-restream preset and use the individual settings outlined below (shown in context after the args list). Can you try this and report back?

          input_args:
            - -rtsp_transport 
            - 'tcp'
            - -timeout 
            - '5000000'
            - -analyzeduration
            - 1000M
            - -probesize
            - 1000M
            - -strict
            - experimental
            - -avoid_negative_ts
            - make_zero
            - -fflags
            - +genpts+discardcorrupt
            - -flags
            - low_delay
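In context, that sits under the camera's ffmpeg block in place of the preset (a sketch; the camera name and restream path are placeholders):

cameras:
  front_driveway:
    ffmpeg:
      inputs:
        - path: rtsp://127.0.0.1:8554/front_driveway   # go2rtc restream
          # input_args: preset-rtsp-restream            # commented out
          input_args:
            # ...the full list of args from the block above goes here
            - -rtsp_transport
            - tcp
          roles:
            - record
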
aeozyalcin commented 1 year ago

One more thing, you would need to switch the Go2RTC source stream from Reolink camera's HTTP stream to the RTSP stream. HTTP stream is just not playing ball after the Go2RTC switch for some reason.

NickM-27 commented 1 year ago

One more thing, you would need to switch the Go2RTC source stream from Reolink camera's HTTP stream to the RTSP stream. HTTP stream is just not playing ball after the Go2RTC switch for some reason.

Many reolink cameras have a broken rtsp implementation, either with packet loss or pixel smearing. The recommendation in this case would be either to ask reolink support for a firmware with fixed RTSP (I've gotten this for my reolink doorbell and 511WA), or to use the http stream in go2rtc with the ffmpeg: module.

aeozyalcin commented 1 year ago

One more thing, you would need to switch the Go2RTC source stream from Reolink camera's HTTP stream to the RTSP stream. HTTP stream is just not playing ball after the Go2RTC switch for some reason.

Many reolink cameras have a broken rtsp implementation, either with packet loss or pixel smearing. The recommendation in this case would be either to ask reolink support for a firmware with fixed RTSP (I've gotten this for my reolink doorbell and 511WA), or to use the http stream in go2rtc with the ffmpeg: module.

Do you have more details on using the ffmpeg module with the Reolink cameras? I looked at the docs, and I see:

go2rtc:
  streams:
    reolink: 
      - http://reolink_ip/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=username&password=password
      - ffmpeg:reolink#audio=opus

As far as I understand, the ffmpeg:reolink#audio=opus line is just for audio transcoding. I didn't think it would have an impact on the video. Is this what you are referring to when you say to use the ffmpeg module?

NickM-27 commented 1 year ago

I'm suggesting you should use ffmpeg for everything.

ffmpeg:YOUR_STREAM#video=copy#audio=copy#audio=opus

aeozyalcin commented 1 year ago

I gave it a go, and I am still seeing the same thing. I have 2 cameras; one is fine (AD-410), the other one is not (Reolink 810). When I go to /tmp/cache, I can see that the segment recorded from the Reolink camera just keeps climbing in size and never gets copied to persistent storage, whereas the segments recorded from the AD-410 stop growing every 10 seconds and get copied over like they should.

Here are the combinations I have tried, all with the same failure:

go2rtc:
  streams:
    front_driveway: 
      - http://192.168.0.241/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=&password
go2rtc:
  streams:
    front_driveway: 
      - ffmpeg:http://192.168.0.241/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=&password=#video=copy#audio=copy
go2rtc:
  streams:
    front_driveway: 
      - http://192.168.0.241/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=&password
      - ffmpeg:front_driveway#video=copy#audio=copy
NickM-27 commented 1 year ago

Oh, I didn't realize that was the problem being referred to (I thought it was the reolink http timeout logs).

I wouldn't expect changing the go2rtc source to affect how recording segments are stored. That'll be related to the record output args. I haven't seen this issue of recording segments not being ended with any of my reolink or non-reolink cameras.

aeozyalcin commented 1 year ago

Yeah, I was referring to tmpfs filling up. In htop, I can see that ffmpeg is clearly told to do 10-second segments, but I am pretty sure ffmpeg is not obeying that. My theory is it has something to do with the pts timestamps of the incoming h264 stream.
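One way to sanity-check that is to dump a few seconds of packet timestamps from the restream with ffprobe (a sketch; the URL assumes go2rtc's default RTSP port and the stream name from my config above):

ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,dts_time,flags -of csv -read_intervals "%+10" rtsp://127.0.0.1:8554/front_driveway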

blakeblackshear commented 1 year ago

In the past I have seen that happen with certain vsync input parameter options.

louisr0608 commented 1 year ago

Thanks for the suggestions. I've switched my go2rtc configuration to use ffmpeg: for all of the streams, and have upgraded the drive that ffmpeg uses from an HDD to an SSD.

eg. patio_cam:

This has alleviated almost all of my issues.

I have not tried the suggestions from @aeozyalcin yet but will eventually, as I'd prefer to use the higher quality RTSP stream for my 4k cameras. I'll report back then.

Thanks again -Louis

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.