s60sc / ESP32-CAM_MJPEG2SD

ESP32 Camera motion capture application to record JPEGs to SD card as AVI files and stream to browser as MJPEG. If a microphone is installed then a WAV file is also created. Files can be uploaded via FTP or downloaded to browser.
GNU Affero General Public License v3.0

ESP32-CAM_MJPEG2SD

Application for ESP32 / ESP32S3 with OV2640 / OV5640 camera to record JPEGs to SD card as AVI files and playback to browser as an MJPEG stream. The AVI format allows recordings to replay at correct frame rate on media players. If a microphone is installed then a WAV file is also created and stored in the AVI file.
The application supports:

The ESP32 cannot support all of the features as it will run out of heap space. For better functionality and performance, use one of the new ESP32S3 camera boards, eg Freenove ESP32S3 Cam or ESP32S3 XIAO Sense, but avoid no-name boards marked ESPS3 RE:1.0.

This is a complex app and some users raise issues when the app reports a warning, but a warning is the app notifying the user that there is a problem with their setup, which only the user can fix. Be aware that some clone boards have different specs to the original, eg PSRAM size. Please only raise issues for actual bugs (ERR messages, unhandled library error or crash), or to suggest an improvement or enhancement. Thanks.

Changes in version 10.4:

Purpose

The application enables video capture of motion detection or timelapse recording. Examples include security cameras, wildlife monitoring, rocket flight monitoring, FPV vehicle control. This instructable by Max Imagination shows how to build a WiFi Security Camera using an earlier version of this code, plus a later video on how to install and use the app.

Saving a set of JPEGs as a single file is faster than saving them as individual files and is easier to manage, particularly for small image sizes. The actual rate depends on the quality and size of the SD card and on the complexity and quality of the images. A no-name 4GB SDHC labelled as Class 6 was 3 times slower than a genuine Sandisk 4GB SDHC Class 2. The following recording rates were achieved on a freshly formatted Sandisk 4GB SDHC Class 2 on an AI Thinker OV2640 board, set to maximum JPEG quality and highest clock rate.

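| Frame Size | OV2640 camera max fps | mjpeg2sd max fps | Detection time ms |
| --- | --- | --- | --- |
| 96X96 | 50 | 45 | 15 |
| QQVGA | 50 | 45 | 20 |
| QCIF | 50 | 45 | 30 |
| HQVGA | 50 | 45 | 40 |
| 240X240 | 50 | 45 | 55 |
| QVGA | 50 | 40 | 70 |
| CIF | 50 | 40 | 110 |
| HVGA | 50 | 40 | 130 |
| VGA | 25 | 20 | 80 |
| SVGA | 25 | 20 | 120 |
| XGA | 12.5 | 5 | 180 |
| HD | 12.5 | 5 | 220 |
| SXGA | 12.5 | 5 | 300 |
| UXGA | 12.5 | 5 | 450 |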

The ESP32S3 (using Freenove ESP32S3 Cam board hosting ESP32S3 N8R8 module) runs the app about double the speed of the ESP32 mainly due to much faster PSRAM. It can record at the maximum OV2640 frame rates including audio for all frame sizes except UXGA (max 10fps).

Design

The ESP32 Cam module has 4MB of PSRAM (8MB on most ESP32S3) which is used to buffer the camera frames and the construction of the AVI file to minimise the number of SD file writes, and optimise the writes by aligning them with the SD card sector size. For playback the AVI is read from SD into a multiple sector sized buffer, and sent to the browser as timed individual frames. The SD card is used in MMC 1 line mode, as this is practically as fast as MMC 4 line mode and frees up pin 4 (connected to onboard Lamp), and pin 12 which can be used for eg a PIR.
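The buffering idea described above can be illustrated with a minimal sketch. It is not the app's implementation: the class name, the callback-based sink and the 512 byte sector size are assumptions made for illustration. Data is accumulated in a buffer sized as a whole number of sectors and only handed to the storage layer when the buffer is full, so every write is sector aligned.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Assumed sector size for illustration; 512 bytes is typical for SD cards.
constexpr std::size_t kSectorSize = 512;

// Hypothetical helper: collects data and passes it to `sink` only in
// chunks that are a whole number of sectors long.
class SectorAlignedWriter {
 public:
  using Sink = std::function<void(const uint8_t*, std::size_t)>;

  SectorAlignedWriter(std::size_t sectors, Sink sink)
      : buf_((sectors == 0 ? 1 : sectors) * kSectorSize),
        used_(0),
        sink_(std::move(sink)) {}

  // Copy data into the buffer, flushing each time the buffer fills up.
  void append(const uint8_t* data, std::size_t len) {
    while (len > 0) {
      const std::size_t n = std::min(len, buf_.size() - used_);
      std::memcpy(buf_.data() + used_, data, n);
      used_ += n;
      data += n;
      len -= n;
      if (used_ == buf_.size()) {
        sink_(buf_.data(), used_);  // one aligned, multi-sector write
        used_ = 0;
      }
    }
  }

  // Bytes still waiting in the buffer for the next aligned write.
  std::size_t pending() const { return used_; }

 private:
  std::vector<uint8_t> buf_;
  std::size_t used_;
  Sink sink_;
};
```

In a real recorder the remaining bytes would need to be padded or written out when the file is closed; that step is left out here.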

The AVI files are named using a date time format YYYYMMDD_HHMMSS with added frame size, FPS recording rate, duration in secs, eg 20200130_201015_VGA_15_60.avi, and stored in a per day folder YYYYMMDD. If audio is included the filename ends with _S. If telemetry is available the filename ends with _M.
The ESP32 time is set from an NTP server or connected browser client.
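As an illustration of the naming scheme above, the following sketch builds a file path from the recording details. The struct, the function name and the leading `/` are hypothetical, and the order of the `_S` and `_M` suffixes when both apply is an assumption, since the text does not specify it.

```cpp
#include <cstdio>
#include <string>

// Hypothetical description of a finished recording.
struct RecordingInfo {
  int year, month, day, hour, minute, second;
  std::string frameSize;  // eg "VGA"
  int fps;                // FPS recording rate
  int durationSecs;       // duration in secs
  bool hasAudio;          // filename ends with _S
  bool hasTelemetry;      // filename ends with _M
};

// Build /YYYYMMDD/YYYYMMDD_HHMMSS_<size>_<fps>_<secs>[_S][_M].avi
std::string aviPath(const RecordingInfo& r) {
  char date[40];
  char time[40];
  std::snprintf(date, sizeof(date), "%04d%02d%02d", r.year, r.month, r.day);
  std::snprintf(time, sizeof(time), "%02d%02d%02d", r.hour, r.minute, r.second);
  std::string path = std::string("/") + date + "/" + date + "_" + time + "_" +
                     r.frameSize + "_" + std::to_string(r.fps) + "_" +
                     std::to_string(r.durationSecs);
  if (r.hasAudio) path += "_S";      // assumed to come first if both apply
  if (r.hasTelemetry) path += "_M";
  return path + ".avi";
}
```

With the example values from the text (30 January 2020, 20:10:15, VGA, 15 fps, 60 secs) this yields the file name 20200130_201015_VGA_15_60.avi inside the 20200130 folder.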

Installation

Download the GitHub files into the Arduino IDE sketch folder, removing -master from the application folder name. Compile with at least arduino-esp32 core v3.0.3, which contains network fixes. Select the required ESP-CAM board by uncommenting ONE only of the #define CAMERA_MODEL_* in appGlobals.h unless using one of the defaults:

Optional features are not included by default. To include a feature, in appGlobals.h set relevant #define INCLUDE_* to true.
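As an illustration, a fragment in the style of appGlobals.h might look like the following. The define names are those mentioned in this document; the particular true/false selection is only an example, not the project's shipped settings, and the real file may lay them out differently.

```cpp
// Example selection only: set a define to true to include that feature.
#define INCLUDE_FTP_HFS true   // FTP or HTTPS server upload
#define INCLUDE_SMTP    false  // email (SMTP)
#define INCLUDE_AUDIO   true   // microphone and audio recording
#define INCLUDE_PERIPH  true   // peripherals
#define INCLUDE_MQTT    false  // MQTT
#define INCLUDE_TGRAM   false  // Telegram Bot
#define INCLUDE_WEBDAV  true   // WebDAV server
```

Since the ESP32 can run out of heap space, it is sensible to enable only the features actually needed.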

Select the ESP32 or ESP32S3 Dev Module board and compile with PSRAM enabled and the following Partition scheme:

NOTE:

On first installation, the application will start in wifi AP mode - connect to SSID: ESP-CAMMJPEG..., to allow router to be selected and router password entered via the web page on 192.168.4.1. The configuration data file (except passwords) is automatically created, and the application web pages automatically downloaded from GitHub to the SD card /data folder when an internet connection is available.

Subsequent updates to the application, or to the /data folder files, can be made using the OTA Upload tab. The /data folder can also be reloaded from GitHub using the Reload /data button on the Edit Config tab, or by using a WebDAV client.

An alternative installation process by @ldijkman is described here.

Browser functions only fully tested on Chrome.

Main Function

A recording is generated either by the camera itself detecting motion, or by holding a given pin high (kept low by internal pulldown when released), eg by using an active high motion sensor such as PIR or RCWL-0516 microwave radar. In addition a recording can be requested manually using the Start Recording button on the web page.

To play back a recording, use the Playback & File Transfers sidebar button to select the day folder and then the required AVI file. After selecting the AVI file, press the Start Playback button to play back the recording. The Start Stream button shows a live video-only feed from the camera.

Recordings can then be uploaded to an FTP or HTTPS server or downloaded to the browser for playback on a media application, eg VLC. To incorporate FTP or HTTPS server, set #define INCLUDE_FTP_HFS to true.

A time lapse feature is also available which can run in parallel with motion capture. Time lapse files have the format 20200130_201015_VGA_15_60_T.avi

Other Functions and Configuration

The operation of the application can be modified dynamically as below, by using the main web page, which should mostly be self explanatory.

Connections:

To change the recording parameters:

SD storage management:

View application log via web page, displayed using Show Log tab:

Configuration Web Page

More configuration details accessed via Edit Config tab, which displays further buttons:

Wifi: Additional WiFi and webserver settings.

Motion: See Motion detection by Camera section.

Peripherals eg:

To incorporate peripherals, set #define INCLUDE_PERIPH to true.

The Peripherals tab also enables further config tabs to be displayed:

After changes are applied, press Save then Reboot ESP to restart the peripherals with the changes.

Note that there are not enough free pins on the ESP32 camera module to allow all external sensors to be used. Pins that can be used (with some limitations) are: 3, 4, 12, 13, 26, 27, 32, 33.

Do not use any other exposed pin including pin 16 used by PSRAM.

The ESP32S3 Freenove board can support multiple peripherals with its spare pins. The ESP32S3 XIAO Sense board has fewer free pins but more than the ESP32.

On-board LEDs:

Other: SD, email, telegram, etc management. To incorporate email (SMTP), set #define INCLUDE_SMTP to true.

When a feature is enabled or disabled, the Save button should be used to persist the change, and the ESP should be rebooted using the Reboot ESP button.

Motion detection by Camera

An AVI recording can be generated by the camera itself detecting motion using the motionDetect.cpp file.
JPEG images of any size are retrieved from the camera and 1 in N images are sampled on the fly for movement by decoding them to very small grayscale bitmap images which are compared to the previous sample. The small sizes provide smoothing to remove artefacts and reduce processing time.

For movement detection a high sample rate of 1 in 2 is used. When movement has been detected, the rate for checking for movement stop is reduced to 1 in 10 so that the JPEGs can be captured with only a small overhead. The Detection time ms table shows typical time in millis to decode and analyse a frame retrieved from the OV2640 camera.
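The sampling and comparison approach described above can be sketched as follows. This is a simplified illustration, not the code in motionDetect.cpp: the function names, the per-pixel threshold and the changed-pixel fraction are assumptions, and the JPEG decoding step is omitted (the bitmaps are taken as already decoded).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sample 1 in 2 frames while waiting for movement to start,
// and 1 in 10 frames while checking for movement to stop.
bool shouldSample(uint32_t frameIndex, bool motionActive) {
  const uint32_t interval = motionActive ? 10 : 2;
  return frameIndex % interval == 0;
}

// Compare two equally sized small grayscale bitmaps.
// A pixel counts as changed if its brightness differs by more than
// pixelThreshold; motion is reported if the share of changed pixels
// exceeds changedFraction. Both parameters are illustrative.
bool motionDetected(const std::vector<uint8_t>& prev,
                    const std::vector<uint8_t>& curr,
                    int pixelThreshold, double changedFraction) {
  if (prev.size() != curr.size() || curr.empty()) return false;
  std::size_t changed = 0;
  for (std::size_t i = 0; i < curr.size(); ++i) {
    if (std::abs(int(curr[i]) - int(prev[i])) > pixelThreshold) ++changed;
  }
  return double(changed) / double(curr.size()) > changedFraction;
}
```

Working on very small bitmaps keeps the comparison loop cheap, which is why the per-frame overhead stays low even at the 1 in 2 sample rate.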

Motion detection by camera is enabled by default. To disable it, switch off Enable motion detect under the Motion Detect & Recording sidebar button.

Additional options are provided on the camera index page, where:

Audio Recording

An I2S microphone eg INMP441 is supported by both ESP32 and ESP32S3. A PDM microphone eg MP34DT01 is only supported on ESP32S3. Audio recording works fine on ESP32S3 but is not viable on ESP32 as it significantly slows down the frame rate.

The audio is formatted as 16 bit single channel PCM with sample rate of 16kHz. An I2S microphone needs 3 free pins, a PDM microphone needs 2 free pins (the I2S SCK pin must be set to -1). Pin values (predefined for XIAO Sense) are set under Audio button on the configuration web page.
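As a worked example of the audio format above, the data rate follows directly from the sample rate, sample width and channel count. The constant and function names below are illustrative only.

```cpp
#include <cstdint>

constexpr uint32_t kSampleRateHz = 16000;  // 16kHz
constexpr uint32_t kBitsPerSample = 16;    // 16 bit PCM
constexpr uint32_t kChannels = 1;          // single channel

// Bytes of raw PCM audio produced per second of recording.
constexpr uint32_t audioBytesPerSecond() {
  return kSampleRateHz * (kBitsPerSample / 8) * kChannels;
}

// Bytes of raw PCM audio for a recording of the given length.
constexpr uint64_t audioBytes(uint32_t durationSecs) {
  return static_cast<uint64_t>(audioBytesPerSecond()) * durationSecs;
}
```

This gives 32000 bytes per second, so a 60 second recording carries 1920000 bytes of audio in addition to the video frames.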

The web page has a slider for Microphone Gain. The higher the value the higher the gain for ESP microphone. Selecting 0 cancels the microphone.

The Speaker icon button on the web page can be used to listen to the microphone from the browser.

To incorporate, set #define INCLUDE_AUDIO to true.

Intercom

The intercom feature allows two way conversation between an ESP32 with microphone and amplifier / speaker installed and the device hosting the app web page where the browser has access to the host device microphone and speaker. Access to the device microphone may have security constraints, see audio.cpp. This feature is only viable on an ESP32S3 and needs a good WiFi connection and spatial separation at both ends to prevent a feedback loop.

An I2S amplifier needs 1 free pin on the ESP32S3 if an I2S microphone is installed as they can share the clock pins. Pin values are set under Audio button on the configuration web page.

The web page has a slider for Amplifier Volume. The higher the value the higher the volume for ESP speaker. Selecting 0 cancels the speaker.

On the left side on the main web page are icons for browser device microphone and speaker. Selecting the icon (if not grayed out) activates the browser microphone or speaker.

OV5640

The OV5640 pinout is compatible with boards designed for the OV2640 but the voltage supply is too high for the internal 1.5V regulator, so the camera overheats unless a heat sink is applied.

For recording purposes the OV5640 should only be used with an ESP32S3 board. Frame sizes above FHD framesize should only be used for still images due to memory limitations.

Recordable frame rates for the highest OV5640 frame sizes on an ESP32S3 are:

| Frame Size | FPS |
| --- | --- |
| QSXGA | 4 |
| WQXGA | 5 |
| QXGA | 5 |
| QHD | 6 |
| FHD | 6 |
| P_FHD | 6 |

The OV3660 has not been tested.

Auxiliary Board

To free up pins on the camera board, this app can be installed on both a camera board and an auxiliary board with the latter hosting hardware such as BDC motors, steppers and servos. The communication with the auxiliary board can be either of:

The auxiliary board can be used to drive the hardware for:

To incorporate, set #define INCLUDE_UART to true.

MQTT

To enable MQTT, under Edit Config -> Others tab, enter fields:

MQTT will auto-connect on ping success if the configuration is not blank.

It will send messages e.g. Record On/Off Motion On/Off to the mqtt broker on channel /status.
topic: homeassistant/sensor/{esp cam hostname}/state -> {"MOTION":"ON", "TIME":"10:07:47.560"}

You can also publish control commands to the /cmd channel in order to control camera. topic: homeassistant/sensor/{esp cam hostname}/cmd -> dbgVerbose=1;framesize=7;fps=1
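The command payload shown above is a list of key=value pairs separated by semicolons. A minimal sketch of building the command topic and splitting such a payload is given below; the function names are hypothetical and this is not the app's own parser.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Topic for control commands, following the pattern shown in the text.
std::string cmdTopic(const std::string& hostname) {
  return "homeassistant/sensor/" + hostname + "/cmd";
}

// Split "key=value;key=value" into pairs. Items without '=' or with an
// empty key are skipped.
std::vector<std::pair<std::string, std::string>> parseCommands(
    const std::string& payload) {
  std::vector<std::pair<std::string, std::string>> out;
  std::size_t start = 0;
  while (start <= payload.size()) {
    std::size_t end = payload.find(';', start);
    if (end == std::string::npos) end = payload.size();
    const std::string item = payload.substr(start, end - start);
    const std::size_t eq = item.find('=');
    if (eq != std::string::npos && eq > 0) {
      out.emplace_back(item.substr(0, eq), item.substr(eq + 1));
    }
    start = end + 1;
  }
  return out;
}
```

For the example payload dbgVerbose=1;framesize=7;fps=1 this produces three pairs, which could then be applied one at a time as configuration changes.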

To incorporate, set #define INCLUDE_MQTT to true.

Home assistant MQTT camera integration

Integration with Home Assistant MQTT Camera contributed by @gemi254 - send mqtt discovery messages to:

To incorporate set #define INCLUDE_HASIO to true.

External Heartbeat

Contributed by @alojzjakob, see also https://github.com/alojzjakob/EspSee

Allows access to multiple cameras behind a single dynamic IP, with different ports port-forwarded through the router. It also avoids the need for DDNS, which can be hard or impossible to set up on some routers. You can easily construct a list of your cameras from the data contained in the JSON sent to your server/website.

To enable External Heartbeat, under Edit Config -> Others tab, enter fields:

A heartbeat will be sent every 30 seconds (default). It does a POST request to the defined domain/URI (e.g. www.mydomain.com/my-esp32cam-hub/index.php) with a JSON body containing useful information you might need for your specific application.

If you are using EspSee, it does a POST request to the defined domain/URI (e.g. https://www.espsee.com/heartbeat/?token=[your_token]) with a JSON body containing useful information about your camera, allowing this website to connect it to your user account and provide a way to easily access your camera(s) without the need for DDNS.

If you want to have multiple cameras accessible from the same external IP (behind router) you might need to do port forwarding and set ports on EspSee camera entries accordingly.

To incorporate, set #define INCLUDE_EXTHB to true.

Port Forwarding

To access the app remotely over the internet, set up port forwarding on your router for browser on HTTP port, eg:

(image: example router port forwarding settings)

On remote device, enter url: your_router_external_ip:10880
To obtain your_router_external_ip value, use eg: https://api.ipify.org
Set a static IP address for your ESP camera device.
For security, Authentication settings should be defined in Access Settings sidebar button.

Note that some internet providers will use CGNAT, which will make port forwarding hard to achieve or impossible (you might need to contact your ISP and ask them for a solution if they are willing to help).

I2C Devices

Multiple I2C devices can share the same two I2C pins. As the camera also uses I2C then the other devices can either share the camera I2C pins or use a separate I2C port. The shared I2C concept was contributed by @rjsachse.

The former approach saves pins, particularly on the ESP32, but generally ESP32 cam boards do not have the pins exposed so some soldering of wires is required. The ESP32S3 boards generally have all pins exposed.

The image shows how wires can be connected to the shared I2C port on the ESP32 AI Thinker style cams. The orange wire is the SDA pin (GPIO26) and the white wire is the SCL pin (GPIO27). Each wire is soldered to the top of an on-board resistor.

By default, the I2C port is shared with the camera, but a separate port can be used by defining alternative SDA and SCL pins under the Peripherals tab.

To incorporate I2C support, set #define INCLUDE_I2C to true. To enable a particular I2C device, set corresponding #define USE_* to true in appGlobals.h.

Telemetry Recording

This feature is better used on an ESP32S3 camera board due to performance and memory limitations on ESP32.

Telemetry such as environmental and motion data (eg from BMP280 and MPU9250 on GY-91 board) can be captured during a camera recording. It is stored in a separate CSV file for presentation in a spreadsheet. The CSV file is named after the corresponding AVI file. A subtitle (SRT) file is also created named after the corresponding AVI file. The CSV and SRT files are uploaded or deleted along with the corresponding AVI file. For downloading, the AVI, CSV and SRT files are bundled into a zip file. If the SRT file is in the same folder as the AVI file, telemetry data subtitles will be displayed by a media player.
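To illustrate how telemetry can appear as subtitles, the sketch below formats one entry in the standard SubRip (SRT) layout: a sequence number, a time range and the text, followed by a blank line. The function names and the example text are hypothetical; the app's actual subtitle content is defined in telemetry.cpp.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a time offset in milliseconds as an SRT timestamp HH:MM:SS,mmm.
std::string srtTime(uint32_t ms) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%02u:%02u:%02u,%03u",
                static_cast<unsigned>(ms / 3600000),
                static_cast<unsigned>(ms / 60000 % 60),
                static_cast<unsigned>(ms / 1000 % 60),
                static_cast<unsigned>(ms % 1000));
  return buf;
}

// Build one SRT entry shown from startMs to endMs of the recording.
std::string srtEntry(unsigned index, uint32_t startMs, uint32_t endMs,
                     const std::string& text) {
  return std::to_string(index) + "\n" + srtTime(startMs) + " --> " +
         srtTime(endMs) + "\n" + text + "\n\n";
}
```

With a collection interval of one second, each entry would typically span from one sample time to the next, so the media player shows the latest readings until they are replaced.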

The user needs to add the code for the required sensors to the file telemetry.cpp, which contains a simple example for the BMP280 and MPU9250 devices.

To switch on telemetry recording, select the Use telemetry recording option under the Peripherals button. The frequency of data collection is set by Telemetry collection interval (secs).

Note: the esp-camera library conflicts with the Adafruit sensor library if that is used.

To incorporate, set #define INCLUDE_TELEM to true.

Telegram Bot

Only enable one of Telegram or SMTP email.
Use IDBot to obtain your Chat ID.
Use BotFather to create a Telegram Bot and obtain the Bot Token.
In Edit Config page under Other tab, paste in Telegram chat identifier and Telegram Bot token and select Use Telegram Bot.
You may want to make the bot private.
Note that this feature uses a lot of heap space due to TLS.

The Telegram Bot will now receive motion alerts from the app showing a frame from the recording with a caption containing a command link for the associated recording (max 50MB) which can be downloaded and played.

To incorporate, set #define INCLUDE_TGRAM to true.

Remote Control

Provides for remote control of the device on which the camera is mounted, e.g. an RC vehicle for FPV.
Best used with ESP32-S3 for frame rate and control responsiveness.

To enable, in Edit Config page under Peripherals, select Enable remote control, then save and reboot. This will show an extra config button RC Config.
Pressing the RC Config button will allow pins to be defined for:

Steering can either be provided by servo control, or by track steering using separately controlled left and right side motors.
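For track steering, a common way to derive the two motor speeds from a throttle and a steering input is simple differential mixing, sketched below. This is a generic technique shown for illustration, not necessarily how the app computes its motor outputs; the names and the -100 to 100 range are assumptions.

```cpp
#include <algorithm>

// Speeds for the left and right side motors, each in -100..100.
struct TrackSpeeds {
  int left;
  int right;
};

// throttle: -100 (full reverse) .. 100 (full forward)
// steer:    -100 (full left)    .. 100 (full right)
// Steering right speeds up the left track and slows the right track.
TrackSpeeds mixTrackSteering(int throttle, int steer) {
  auto clamp = [](int v) { return std::max(-100, std::min(100, v)); };
  return {clamp(throttle + steer), clamp(throttle - steer)};
}
```

With zero throttle and a non-zero steer value the two tracks run in opposite directions, which turns the vehicle on the spot.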

The streaming view will now have a red button in the top left. Press this to show / hide overlaid steering and motor controls. Camera view buttons can be used to change to full screen. Tethered vehicles can also be controlled via a HW-504 type joystick. Camera view (and microphone and telemetry if enabled) can be recorded.
Motion detection should be disabled beforehand.

This feature can make use of an Auxiliary Board.

To incorporate, set #define INCLUDE_PERIPH to true and #define INCLUDE_MCPWM to true.

Only use this feature if you are familiar with coding and electronics, and can fix issues yourself.

Machine Learning

Machine Learning AI can be used to further discriminate whether to save a recording when motion detection has occurred by classifying whether the object in the frame is of interest, eg a human, type of animal, vehicle etc.

Only feasible on ESP32S3 due to memory use and built in AI Acceleration support.

Only use this feature if you are familiar with Machine Learning.

The interface is designed to work with user models packaged as Arduino libraries by the Edge Impulse AI platform. More details in appGlobals.h.

Use 96x96 grayscale or RGB images and train the model with for example the following Transfer learning Neural Network settings:

Camera Hub

This tab enables the web interfaces of other ESP32-CAM_MJPEG2SD camera devices to be accessed. To show this tab, in Edit Config page under Other, select Show Camera Hub tab.

In the tab, enter the IP address of another camera and press the Add IP button. A screen showing an image from the camera is displayed with its IP address overlaid. Repeat for each camera to be monitored. Click on an image to open the web page for that camera.

Press X icon on image to remove that IP address. Press Delete All button to remove all IP addresses. Press Refresh button to update each screen with the latest image from that camera.

The IP addresses are stored in the browser local storage, not the app itself.

Stream to NVR

This feature is better used on an ESP32S3 camera board due to performance and memory limitations on ESP32.

Streams separate from the web browser are available for capture by a remote NVR. To enable these streams, under Edit Config -> Motion tab, select:

Then save and reboot.

If multiple streams are enabled they need to be processed by an intermediate tool for synchronisation, eg go2rtc (which may not handle subtitles yet). See ESP32-CAM_Audio for go2rtc configuration examples. If a recording occurs during streaming it will take priority and the streams may stutter.

WebDAV

A simple WebDAV server is included. A WebDAV client such as Windows 10 File Explorer can be used to access and manage the SD card content. In a folder's address bar enter <ip_address>/webdav, eg 192.168.1.132/webdav
For Windows 11, Android, MacOS, Linux see webDav.cpp file.

To incorporate, set #define INCLUDE_WEBDAV to true.

Photogrammetry

The ESP can be used to capture a series of photographs of small objects on a stepper motor driven turntable that it controls, using either the ESP camera for low resolution images, or a DSLR camera remotely controlled by the ESP for high resolution images. The captured images can be used to generate a 3D model.

To enable this feature, in Edit Config page under Peripherals, select Enable photogrammetry, then save and reboot.
This will show an extra config button PG Config. Pressing this button will bring up options for controlling the photogrammetry process.

This feature can make use of an Auxiliary Board.

See photogram.cpp for more information. To incorporate, set #define INCLUDE_PGRAM to true.