Closed Swiftnesses closed 1 year ago
Hmm, I don't think that would work; that's an image classification model, not an object detection one.
Is your camera very zoomed in on the feeder? If so, it could possibly work.
@roflcoopter the camera is very close to the feeding tray.
I spent the night working with OpenCV, using the bbox coordinates to resize the image to a working copy, which I then pass to the classifier code. It works, but I'm not sure it's ideal, as the resized images are sized purely by the detection box and do not respect the aspect ratio or MobileNet's 224x224 input size.
I'm new to this (and Python) but trying! If I could just work out a simple way to resize the images based on the box size, I'd be golden!
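One way to keep the aspect ratio when preparing a crop for a 224x224 classifier is to scale the longer side to the target size and pad the remainder (often called letterboxing). A minimal sketch of the arithmetic, assuming a square target; `letterbox_geometry` is a hypothetical helper, not part of pycoral or OpenCV:

```python
def letterbox_geometry(box_w, box_h, target=224):
    """Return (new_w, new_h, pad_left, pad_top) for fitting a box into a square.

    The box is scaled so its longer side equals `target`; the shorter side is
    centred using padding. Hypothetical helper, for illustration only.
    """
    if box_w <= 0 or box_h <= 0:
        raise ValueError("box dimensions must be positive")
    ratio = target / max(box_w, box_h)
    new_w = max(1, round(box_w * ratio))
    new_h = max(1, round(box_h * ratio))
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


if __name__ == "__main__":
    # A 300x150 detection box scales to 224x112, padded 56 px top and bottom.
    print(letterbox_geometry(300, 150))
```

The resized crop could then be placed on a 224x224 canvas at (pad_left, pad_top), for example with `cv2.copyMakeBorder`. Whether padding actually improves results for this particular model is untested here.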
Ahh so you are running object detection first, which finds a bird, then you pass that image to the classifier?
@roflcoopter trying to, yes, exactly!
I've finally got some working code. I've never written Python before, so I'm sure it's a nightmare!
import argparse
import time
from PIL import Image
from PIL import ImageDraw
from pycoral.adapters import common
from pycoral.adapters import detect
from pycoral.adapters import classify
from pycoral.utils.dataset import read_label_file
from pycoral.utils.edgetpu import make_interpreter
# ADDED
import requests
import numpy as np
import urllib.request
import cv2
# END ADDED
# HELPERS
def draw_objects(draw, objs, labels):
    """Draws the bounding box and label for each object."""
    for obj in objs:
        bbox = obj.bbox
        draw.rectangle([(bbox.xmin, bbox.ymin), (bbox.xmax, bbox.ymax)],
                       outline='red')
        draw.text((bbox.xmin + 10, bbox.ymin + 10),
                  '%s\n%.2f' % (labels.get(obj.id, obj.id), obj.score),
                  fill='red')
# END HELPERS
def main():
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('-f', '--file', help='File path of image to process')
    parser.add_argument('-u', '--url', help='URL to input file')
    parser.add_argument('-dm', '--detection_model', required=True, help='File path of .tflite file')
    parser.add_argument('-dl', '--detection_labels', help='File path of labels file')
    parser.add_argument('-cm', '--classification_model', required=True, help='File path of .tflite file')
    parser.add_argument('-cl', '--classification_labels', help='File path of labels file')
    parser.add_argument('-t', '--threshold', type=float, default=0.4, help='Score threshold for detected objects')
    parser.add_argument('-o', '--output', help='File path for the result image with annotations')
    parser.add_argument('-c', '--count', type=int, default=5, help='Number of times to run inference')
    args = parser.parse_args()
    labels = read_label_file(args.detection_labels) if args.detection_labels else {}
    interpreter = make_interpreter(args.detection_model)
    interpreter.allocate_tensors()
    # BEGIN Check input variables
    # Exactly one of --file / --url must be given
    if args.file is not None and args.url is not None:
        print('Only one input can be provided!')
        exit()
    if args.file is None and args.url is None:
        print('At least one input (--file / --url) must be provided!')
        exit()
    if args.url is not None:
        print('URL provided')
        urllib.request.urlretrieve(args.url, "/data/detect.jpeg")
        image = Image.open("/data/detect.jpeg")
    if args.file is not None:
        print('File location provided')
        image = Image.open(args.file)
        image.save("/data/detect.jpeg")
        image = Image.open("/data/detect.jpeg")
    # END Check input variables
    # Image.LANCZOS is the same filter as the older Image.ANTIALIAS alias
    _, scale = common.set_resized_input(
        interpreter, image.size, lambda size: image.resize(size, Image.LANCZOS))
    print('----DETECT INFERENCE TIME----')
    print('Note: The first inference is slow because it includes',
          'loading the model into Edge TPU memory.')
    for _ in range(args.count):
        start = time.perf_counter()
        interpreter.invoke()
        inference_time = time.perf_counter() - start
        objs = detect.get_objects(interpreter, args.threshold, scale)
        print('%.2f ms' % (inference_time * 1000))
    print('-------DETECT RESULTS--------')
    if not objs:
        print('No objects detected')
    for obj in objs:
        print(labels.get(obj.id, obj.id))
        print(' id: ', obj.id)
        print(' score: ', obj.score)
        print(' bbox: ', obj.bbox)
        xmins = obj.bbox.xmin
        ymins = obj.bbox.ymin
        xmaxs = obj.bbox.xmax
        ymaxs = obj.bbox.ymax
        if labels.get(obj.id, obj.id) == 'bird':
            print("It's a" + " " + labels.get(obj.id, obj.id))
            # Crop the detected bird out of the saved image and classify the crop
            image_to_crop = cv2.imread("/data/detect.jpeg")
            image_cropped = image_to_crop[ymins:ymaxs, xmins:xmaxs]
            cv2.imwrite("/data/detect_cropped.jpeg", image_cropped)
            classification("/data/detect_cropped.jpeg", args.classification_model, args.classification_labels)
    if args.output:
        image = image.convert('RGB')
        draw_objects(ImageDraw.Draw(image), objs, labels)
        image.save(args.output)
        # image.show()
def classification(img, classification_model, classification_labels):
    print(img)
    input_mean = 128.0
    input_std = 128.0
    count = 5
    top_k = 1
    threshold = 0.0
    labels = read_label_file(classification_labels) if classification_labels else {}
    interpreter = make_interpreter(*classification_model.split('@'))
    interpreter.allocate_tensors()
    # Model must be uint8 quantized
    if common.input_details(interpreter, 'dtype') != np.uint8:
        raise ValueError('Only support uint8 input type.')
    size = common.input_size(interpreter)
    # Image.LANCZOS is the same filter as the older Image.ANTIALIAS alias
    image = Image.open(img).convert('RGB').resize(size, Image.LANCZOS)
    # Image data must go through two transforms before running inference:
    # 1. normalization: f = (input - mean) / std
    # 2. quantization: q = f / scale + zero_point
    # The following code combines the two steps as such:
    # q = (input - mean) / (std * scale) + zero_point
    # However, if std * scale equals 1, and mean - zero_point equals 0, the input
    # does not need any preprocessing (but in practice, even if the results are
    # very close to 1 and 0, it is probably okay to skip preprocessing for better
    # efficiency; we use 1e-5 below instead of absolute zero).
    params = common.input_details(interpreter, 'quantization_parameters')
    scale = params['scales']
    zero_point = params['zero_points']
    mean = input_mean
    std = input_std
    if abs(scale * std - 1) < 1e-5 and abs(mean - zero_point) < 1e-5:
        # Input data does not require preprocessing.
        common.set_input(interpreter, image)
    else:
        # Input data requires preprocessing
        normalized_input = (np.asarray(image) - mean) / (std * scale) + zero_point
        np.clip(normalized_input, 0, 255, out=normalized_input)
        common.set_input(interpreter, normalized_input.astype(np.uint8))
    # Run inference
    print('----CLASSIFICATION INFERENCE TIME----')
    print('Note: The first inference on Edge TPU is slow because it includes',
          'loading the model into Edge TPU memory.')
    for _ in range(count):
        start = time.perf_counter()
        interpreter.invoke()
        inference_time = time.perf_counter() - start
        classes = classify.get_classes(interpreter, top_k, threshold)
        print('%.1fms' % (inference_time * 1000))
    print('-------CLASSIFICATION RESULTS--------')
    for c in classes:
        print('%s: %.5f' % (labels.get(c.id, c.id), c.score))
if __name__ == '__main__':
    main()
I see, that is exactly how post processors work in Viseron (face recognition, for instance). When an object is detected, you can mark that label (bird) to be sent to a specific post processor.
The post processor gets both the original image and the cropped bbox image to work with.
So this is basically a plug-and-play thing since it fits the architecture. I am, however, working on a huge rewrite, so it will take a little while before I can implement it.
One issue here, though, is that the TPU only supports one model at a time, so it would flip-flop between object detection and classification, making it very slow.
A solution to that is to use multiple TPUs or a different object detector.
@roflcoopter I have tried Viseron, but it was a little complex for me right now.
My code above seems to work reasonably well, any comments?
I don't envision too much activity, well, unless they're hungry!
All this so the kids can see what's visiting the garden 😄
Looks fine to me! I'll change the issue title so I can remember to implement this as a post processor.
@roflcoopter I managed to make a quick flask API, so I can send the arguments to the server and get the classification info back - not bad for a few days learning. It's slow and a bit dumb as I need to keep refreshing the snapshot url of my camera (unifi), but it works (or at least, it should, no notifications since going live!).
I was considering trying to get streaming video working, but likely too much work for me, given my knowledge!
I'll look out for a new release and transition over if/when you complete it. Ta.
@roflcoopter, quick question perhaps you can help me with while I wait for the rewrite.
I've changed my code to use an rtsp feed instead of a jpeg snapshot url - it works fine.
Sadly, the CPU usage is insane as it doesn't use FFmpeg VAAPI. I initialise the stream using:
cap = cv2.VideoCapture(args.rtsp_stream, cv2.CAP_FFMPEG)
Would you happen to know how to make opencv use ffmpeg "-hwaccel vaapi"?
Sorry, I missed replying here. You would have to build ffmpeg with hwaccel support, then build OpenCV against your custom ffmpeg.
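An alternative that avoids rebuilding OpenCV is to let a stock ffmpeg binary do the hardware decode and pipe raw frames into Python. A minimal sketch, assuming an ffmpeg build with VAAPI support is on the PATH and the desired frame size is known; `build_ffmpeg_cmd`, `frame_size_bytes` and `read_frames` are hypothetical helpers, and this is a swapped-in technique rather than what OpenCV's FFMPEG backend does:

```python
import subprocess


def build_ffmpeg_cmd(rtsp_url, width, height):
    """Build an ffmpeg command that decodes with VAAPI and emits raw BGR frames."""
    return [
        "ffmpeg",
        "-hwaccel", "vaapi",         # request VAAPI hardware decoding
        "-rtsp_transport", "tcp",    # assumption: RTSP over TCP
        "-i", rtsp_url,
        "-vf", f"scale={width}:{height}",
        "-f", "rawvideo",
        "-pix_fmt", "bgr24",         # 3 bytes per pixel, OpenCV's channel order
        "-",                         # write frames to stdout
    ]


def frame_size_bytes(width, height):
    """Number of bytes in one bgr24 frame."""
    return width * height * 3


def read_frames(rtsp_url, width, height):
    """Yield raw frames as bytes until the stream ends."""
    proc = subprocess.Popen(
        build_ffmpeg_cmd(rtsp_url, width, height),
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    size = frame_size_bytes(width, height)
    try:
        while True:
            data = proc.stdout.read(size)
            if len(data) < size:
                break  # stream ended or ffmpeg exited
            yield data
    finally:
        proc.kill()
        proc.wait()
```

Each yielded frame could be turned into an image array with `numpy.frombuffer(data, dtype=numpy.uint8).reshape(height, width, 3)`. Whether this actually lowers CPU usage depends on the hardware and the ffmpeg build.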
@Swiftnesses not sure if you are still interested in this, but I have now included it in the v2 rewrite.
@roflcoopter that sounds amazing - so I'll essentially be able to use it for my bird recognition use case?
Yes exactly!
Oh, this is amazing. Thank you!
@roflcoopter believe it or not, I'm just coming back to this.
It appears that I cannot use my PCI edge device (NUC) for both the object detector AND the image classification (and swap between them as discussed). If I set one of them to cpu, it works.
Any ideas?
OK, I have this working with the following config:
# See the README for the full list of configuration options.
ffmpeg:
  camera:
    bird_feeder:
      name: Bird Feeder
      host: 192.168.1.3
      port: 7447
      path: /REDACTED
      width: 1920
      height: 1080
      fps: 30
      # codec: h264
      # audio_codec: aac
edgetpu:
  object_detector:
    model_path: /detectors/models/custom/edgetpu/object_detection/tf2_ssd_mobilenet_v2_coco17_ptq_edgetpu.tflite
    label_path: /detectors/models/custom/edgetpu/object_detection/labels.txt
    device: pci
    cameras:
      bird_feeder:
        fps: 30
        scan_on_motion_only: false
        labels:
          - label: bird
            confidence: 0.5
            trigger_recorder: true
  image_classification:
    model_path: /detectors/models/custom/cpu/image_classification/mobilenet_v2_1.0_224_inat_bird_quant.tflite
    label_path: /detectors/models/custom/cpu/image_classification/labels.txt
    device: cpu
    cameras:
      bird_feeder:
        labels:
          - bird
    labels:
      - person
nvr:
  bird_feeder:
# MQTT is optional
mqtt:
  broker: 192.168.1.16
  port: 1883
  username: REDACTED
  password: REDACTED
The issue I'm having is that the image classification is extremely poor: confidence levels never exceed 10% and are always wrong. When I move back to my code above, they're perfect.
I'm so close to replacing my terrible code - I love the MQTT implementation here too, so much potential!
I am using https://coral.ai/models/image-classification/ - iNaturalist 2017 (Birds), same as my code above.
@roflcoopter,
Looking at this further, it appears to be related to the processing of the image (my code was largely taken from Google's repo), before classification. If I skip this in my code (image processing), I also get extremely poor results.
I THINK it's related to some of this?
def classification(classification_model, classification_labels, threshold, set_output_processed_classified, set_output_processed_cropped, cropped_image):
    # Horrible workaround to check if image exists (need to fix!)
    if not cropped_image.size > 2: # np.shape(cropped_image) == () | cropped_image is None | cropped_image.size>2
        logging.info("***Shit***, the classification image isn't valid, [index: " + str(variables.detect_index) + "].")
    else:
        logging.info("Great, the classification image is valid, [index: " + str(variables.detect_index) + "].")
        input_mean = 128.0
        input_std = 128.0
        count = 1
        top_k = 1
        logging.debug("Loading {} with {} labels.".format(classification_model, classification_labels))
        interpreter_classification = make_interpreter(classification_model)
        interpreter_classification.allocate_tensors()
        labels_classification = read_label_file(classification_labels) if classification_labels else {}
        inference_size_classification = common.input_size(interpreter_classification)
        # Model must be uint8 quantized
        if common.input_details(interpreter_classification, "dtype") != np.uint8:
            raise ValueError("***Shit***, the classification model only supports uint8 input types.")
        classification_image = cv2.cvtColor(cropped_image, cv2.COLOR_BGR2RGB)
        classification_image = cv2.resize(classification_image, inference_size_classification)
        params = common.input_details(interpreter_classification, "quantization_parameters")
        scale = params["scales"]
        zero_point = params["zero_points"]
        mean = input_mean
        std = input_std
        if abs(scale * std - 1) < 1e-5 and abs(mean - zero_point) < 1e-5:
            # Input data does not require preprocessing
            common.set_input(interpreter_classification, classification_image)
        else:
            # Input data requires preprocessing
            normalized_input = (np.asarray(classification_image) - mean) / (std * scale) + zero_point
            np.clip(normalized_input, 0, 255, out=normalized_input)
            common.set_input(interpreter_classification, normalized_input.astype(np.uint8))
        # Run inference on TPU
        logging.debug("----CLASSIFICATION INFERENCE TIME----")
        logging.debug("Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.")
        for _ in range(count):
            start = time.perf_counter()
            interpreter_classification.invoke()
            inference_time = time.perf_counter() - start
            classification_objs = classify.get_classes(interpreter_classification, top_k, threshold)
            logging.debug("%.1fms" % (inference_time * 1000))
Seems related to my classification model being quantized?
Thanks for your detailed report!
Very interesting findings; it seems that the EdgeTPU API has changed quite a bit compared to the example I coded against.
Also, this example does it a bit differently than both you and me.
There might be some magic going on in the run_inference method, which I am not using today.
Looking into it!
@roflcoopter thank you :)
I noticed that my processed images have very odd colours if I save them (using the code above, which was from an example somewhere). If I comment out the normalisation piece of my code, they're normal, but detection is almost useless. It's likely not related, as the example you linked doesn't appear to do anything special, unless it's related to my model being quantised. I have no idea about that, to be honest.
Let me know if you want me to test anything.
It's all a bit unclear to me as well. On the Coral models page it says this:
Beware that the EfficientNet family of models have unique input quantization values (scale and zero-point) that you must use when preprocessing your input. For example preprocessing code, see the classify_image.py or classify_image.cc examples.
But they don't mention the iNaturalist models being quantized.
The only giveaway is “quant” in the model name…
That does explain why my code works: I took my code snippet from the links you used. So essentially, are we saying that if it's a quantised model, we need to process it differently? Perhaps we could just have a flag to indicate the model type, or am I oversimplifying things due to lack of experience?!
But then this example doesn’t do anything and seems to work and uses the same model. Urgh.
https://github.com/google-coral/edgetpu/blob/master/examples/classify_image.py
Oh so this example works well for you?
Ha. I haven’t tried it, just took it for granted.
Ahh, that example uses an old deprecated API, edgetpu; it's called pycoral these days, so that is not a good idea to implement.
However, it works fine with the code provided in https://github.com/google-coral/pycoral/blob/9972f8e/examples/classify_image.py, with no need for a flag or anything, since it figures out on its own whether preprocessing is needed.
I already do other preprocessing, so I just have to add this. Will fix and push soon.
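The check in that example boils down to two comparisons on the model's input quantization parameters. A minimal pure-Python sketch of the same arithmetic; the mean and std of 128 match the scripts in this thread, while the scale and zero-point values below are made-up illustrations, not values read from the iNaturalist model:

```python
def needs_preprocessing(scale, zero_point, mean=128.0, std=128.0, tol=1e-5):
    """True unless normalization and quantization cancel each other out."""
    return not (abs(scale * std - 1) < tol and abs(mean - zero_point) < tol)


def quantize_pixel(value, scale, zero_point, mean=128.0, std=128.0):
    """Apply q = (value - mean) / (std * scale) + zero_point, clipped to uint8."""
    q = (value - mean) / (std * scale) + zero_point
    return int(min(255, max(0, q)))


if __name__ == "__main__":
    # Illustrative parameters only: scale = 1/128 and zero_point = 128 cancel out.
    print(needs_preprocessing(0.0078125, 128))   # False
    # A different (made-up) scale means raw pixels must be remapped first.
    print(needs_preprocessing(0.02, 114))        # True
    print(quantize_pixel(200, 0.02, 114))        # 142
```

With parameters like the second pair, feeding raw pixel values straight into the model would shift every input, which is consistent with the poor confidence levels seen when the preprocessing step is skipped.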
Gotcha, that's basically the code I use, I see now it decides if processing is required.
Can't wait to test it!
BTW, is the classification data only output via MQTT (and subsequently to Home Assistant)? I kind of expected it to be included in the recording or snapshot, but I only see the object on snapshot and nothing on the recordings, just wondering if I'm missing something!
Really appreciate your help on this, thank you :)
It is not included in the recording; post processors run after an object is detected but are completely detached from the recorder.
So yes it is only sent over MQTT.
I just pushed #400 which will be in the dev tag shortly!
The build is complete now, please test it out when you get the chance!
@roflcoopter, no luck I'm afraid. It now detects different birds and background (much more than before), but it's never the right bird! To test both my code and yours, I simply hold up a picture of a European blue tit on my phone. My code gets it right 95% of the time.
Hmm interesting, could you link the image you are testing?
Sure, I just used the image attached, the other picture is the resulting push notification on my phone.
Here is my current working code for reference if it helps:
# General
import argparse
import time
import numpy as np
import requests
from datetime import datetime
# OpenCV
import cv2
# Logging
import logging
import os
import sys
logging.basicConfig(
    format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    level=os.environ.get("LOGLEVEL", "INFO"),
    stream=sys.stdout,
)
# PYCoral requirements
from pycoral.utils.dataset import read_label_file
from pycoral.utils.edgetpu import make_interpreter
from pycoral.utils.edgetpu import run_inference
from pycoral.adapters import common
from pycoral.adapters import classify
from pycoral.adapters import detect
# Required for health check thread
from threading import Timer
# Improves capture speed
import threading
from queue import Queue
# Imported function to send email to Zapier
from send_email import send_email
# Set directory paths (relative)
script_directory = os.path.dirname(os.path.realpath(__file__))
processed_directory = "processed"
# START MAIN SCRIPT
class variables():
    bird_tracker = {}
    result_bird = None # {}
    detect_sucess = None # False
    detect_index = None # 0
    current_time = None # datetime.utcnow().strftime("%Y-%m-%d_%H-%M-%S-%f")[:-3]
# START HEARTBEAT SENDER
def heartbeat():
    Timer(60, heartbeat).start()
    url = "http://redacted:redacted@192.168.1.16:1880/endpoint/bird_feeder"
    requests.post(url, data = {"heartbeat": "OK"})
    logging.info("Sending heartbeat to Node-RED...")
heartbeat()
# END HEARTBEAT SENDER
def main():
    default_model_dir = "/data/models/detection"
    default_detect_model = "tf2_ssd_mobilenet_v2_coco17_ptq_edgetpu.tflite" # "tf2_ssd_mobilenet_v2_coco17_ptq_edgetpu.tflite" / "efficientdet_lite3_512_ptq_edgetpu.tflite" /
    default_detect_labels = "tf2_ssd_mobilenet_v2_coco17_ptq_edgetpu.txt" # "tf2_ssd_mobilenet_v2_coco17_ptq_edgetpu.txt" / "efficientdet_lite3_512_ptq_edgetpu.txt" /
    default_classification_model = "/data/models/classification/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite"
    default_classification_labels = "/data/models/classification/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.txt"
    parser = argparse.ArgumentParser()
    parser.add_argument("--detect_model", help=".tflite model path", default=os.path.join(default_model_dir, default_detect_model))
    parser.add_argument("--detect_labels", help="label file path", default=os.path.join(default_model_dir, default_detect_labels))
    parser.add_argument("--classification_model", help=".tflite model path", default=os.path.join(default_model_dir, default_classification_model))
    parser.add_argument("--classification_labels", help="label file path", default=os.path.join(default_model_dir, default_classification_labels))
    parser.add_argument("--top_k", type=int, default=1, help="number of categories with highest score to display")
    parser.add_argument("--rtsp_stream", required=True, help="The full rtsp stream url")
    parser.add_argument("--detect_threshold", type=float, default=0.7, help="detect score threshold")
    parser.add_argument("--classification_threshold", type=float, default=0.7, help="classification score threshold")
    parser.add_argument("--set_output_processed", default=False, action="store_true", help="save the processed image with object boxes")
    parser.add_argument("--set_output_processed_cropped", default=False, action="store_true", help="save the processed cropped images used for classification")
    parser.add_argument("--set_output_processed_classified", default=False, action="store_true", help="save the processed classified images")
    parser.add_argument("--snooze_time", type=int, default=3600, help="set time before detecting the same species again") # default is 1 hour
    args = parser.parse_args()
    logging.debug("Loading {} with {} labels.".format(args.detect_model, args.detect_labels))
    interpreter_detect = make_interpreter(args.detect_model)
    interpreter_detect.allocate_tensors()
    labels_detect = read_label_file(args.detect_labels)
    inference_size_detect = common.input_size(interpreter_detect)
    q = Queue(maxsize = 1) # avoids a backlog in memory while processing frames, we only need to capture the latest frame...
    def receive():
        while True:
            try:
                cap = cv2.VideoCapture(args.rtsp_stream)
                ret, frame = cap.read()
                q.put(frame)
                while ret:
                    ret, frame = cap.read()
                    q.put(frame)
            except Exception:
                logging.info("No camera FEED found... will keep trying.")
                time.sleep(5)
    def process():
        while True:
            try:
                if not q.empty():
                    frame = q.get()
                    cv2_im = frame
                    cv2_im_rgb = cv2.cvtColor(cv2_im, cv2.COLOR_BGR2RGB)
                    cv2_im_rgb = cv2.resize(cv2_im_rgb, inference_size_detect)
                    # cv2.imwrite("/data/streamer/tests/test_" + str(variables.current_time) + ".jpeg", frame)
                    run_inference(interpreter_detect, cv2_im_rgb.tobytes())
                    detect_objs = detect.get_objects(interpreter_detect, args.detect_threshold)
                    # Updates current time for use as timestamp in all functions
                    variables.current_time = datetime.utcnow().strftime("%Y-%m-%d_%H-%M-%S-%f")[:-3]
                    # cv2.imwrite("/data/streamer/tests/test_image_" + str(variables.current_time) + ".jpeg", frame)
                    # Create empty dictionary to hold detection / classification data
                    variables.result_bird = {"name": [], "score": [], "detect_url": [], "classification_url": [], "cropped_url": []}
                    # Reset detections
                    variables.detect_sucess = False
                    if not detect_objs:
                        logging.debug("-----------DETECT RESULT-------------")
                        logging.debug("No objects detected.")
                    # for index in range(len(detect_objs)):
                    for index, obj in enumerate(detect_objs):
                        logging.debug("-----------DETECT RESULT-------------")
                        # logging.debug(labels_detect.get(obj.id, obj.id))
                        # logging.debug(' id: ', obj.id)
                        # logging.debug(' score: ', obj.score)
                        # logging.debug(' bbox: ', obj.bbox)
                        if labels_detect.get(obj.id, obj.id) == "bird":
                            variables.detect_index = index
                            logging.info("Bird object detected, sending to classification model [index: " + str(variables.detect_index) + "].")
                            # Crop the image for classification
                            cropped_image = crop_image(cv2_im, inference_size_detect, obj)
                            # Run inference on TPU (using cropped image)
                            classification(args.classification_model, args.classification_labels, args.classification_threshold, args.set_output_processed_classified, args.set_output_processed_cropped, cropped_image)
                    # If required, save frame that was processed
                    if args.set_output_processed and variables.detect_sucess:
                        processed_image = append_objs_to_img(cv2_im, inference_size_detect, detect_objs, labels_detect)
                        save_processed_image(processed_image)
                    # Process results
                    if variables.detect_sucess:
                        process_results()
                    # UPDATE TRACKER DICTIONARY
                    # Create filter to check dictionary with
                    now = time.time()
                    filter_v = now - args.snooze_time
                    # Check dictionary and remove old entries
                    for k, v in list(variables.bird_tracker.items()):
                        if v <= filter_v:
                            del variables.bird_tracker[k]
                            logging.info("Removed bird species: " + "'" + k + "'" + " from the snooze list...")
            except Exception:
                logging.info("No camera FRAME found... will keep trying.")
                time.sleep(5)
    receive_thread = threading.Thread(target=receive, daemon=True)
    process_thread = threading.Thread(target=process)
    receive_thread.start()
    process_thread.start()
    process_thread.join()
def classification(classification_model, classification_labels, threshold, set_output_processed_classified, set_output_processed_cropped, cropped_image):
    # Horrible workaround to check if image exists (need to fix!)
    if not cropped_image.size > 2: # np.shape(cropped_image) == () | cropped_image is None | cropped_image.size>2
        logging.info("***Shit***, the classification image isn't valid, [index: " + str(variables.detect_index) + "].")
    else:
        logging.info("Great, the classification image is valid, [index: " + str(variables.detect_index) + "].")
        input_mean = 128.0
        input_std = 128.0
        count = 1
        top_k = 1
        logging.debug("Loading {} with {} labels.".format(classification_model, classification_labels))
        interpreter_classification = make_interpreter(classification_model)
        interpreter_classification.allocate_tensors()
        labels_classification = read_label_file(classification_labels) if classification_labels else {}
        inference_size_classification = common.input_size(interpreter_classification)
        # Model must be uint8 quantized
        if common.input_details(interpreter_classification, "dtype") != np.uint8:
            raise ValueError("***Shit***, the classification model only supports uint8 input types.")
        classification_image = cv2.cvtColor(cropped_image, cv2.COLOR_BGR2RGB)
        classification_image = cv2.resize(classification_image, inference_size_classification)
        params = common.input_details(interpreter_classification, "quantization_parameters")
        scale = params["scales"]
        zero_point = params["zero_points"]
        mean = input_mean
        std = input_std
        if abs(scale * std - 1) < 1e-5 and abs(mean - zero_point) < 1e-5:
            # Input data does not require preprocessing
            common.set_input(interpreter_classification, classification_image)
        else:
            # Input data requires preprocessing
            normalized_input = (np.asarray(classification_image) - mean) / (std * scale) + zero_point
            np.clip(normalized_input, 0, 255, out=normalized_input)
            common.set_input(interpreter_classification, normalized_input.astype(np.uint8))
        # Run inference on TPU
        logging.debug("----CLASSIFICATION INFERENCE TIME----")
        logging.debug("Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.")
        for _ in range(count):
            start = time.perf_counter()
            interpreter_classification.invoke()
            inference_time = time.perf_counter() - start
            classification_objs = classify.get_classes(interpreter_classification, top_k, threshold)
            logging.debug("%.1fms" % (inference_time * 1000))
        logging.debug("-------CLASSIFICATION RESULT---------")
        for c in classification_objs:
            logging.debug("%s: %.5f" % (labels_classification.get(c.id, c.id), c.score))
            # Check if bird has been detected recently - Note: can cause issues with testing as multiple bird images don't often align due to this check.
            if (labels_classification.get(c.id)) not in variables.bird_tracker:
                # Process only if threshold is acceptable
                if c.score >= threshold:
                    # Convert 0.xxxxxxxx to xx
                    percent = str(int(100 * c.score)) + "%"
                    label = str(labels_classification.get(c.id))
                    logging.info("Successful classification: " + label + " @" + percent + " accuracy")
                    # Used to track whether a successful classification has occurred
                    variables.detect_sucess = True
                    # Track species during this cycle
                    variables.result_bird["name"].append(label)
                    variables.result_bird["score"].append(percent)
                    # Track found bird to manage snoozes
                    variables.bird_tracker[label] = time.time()
                    logging.info("Adding bird species: " + "'" + label + "'" + " to the snooze list...")
                    # Save classification image
                    if set_output_processed_classified:
                        save_classification_image(classification_image, "classification", label, percent)
                    # Save cropped image
                    if set_output_processed_cropped:
                        save_classification_image(cropped_image, "cropped", label, percent)
            else:
                logging.info("Bird species recently detected, snoozing...")
def process_results():
    if variables.result_bird == {"name": [], "score": [], "detect_url": [], "classification_url": [], "cropped_url": []}:
        logging.debug("Cycle finished, no birds found.")
    else:
        logging.info("Cycle finished, found the following birds:\n" + str(variables.result_bird))
        # Send information to Zapier via email
        for k, v in variables.result_bird.items():
            # print(k)
            if k == 'name':
                species_number = len(v)
                species = (v)
            if k == 'score':
                score_number = len(v)
                score = (v)
            if k == 'detect_url':
                image_url = v[0]
        species = (str(species).strip('[]').replace('\'', '').replace(',', ' &'))
        score = (str(score).strip('[]').replace('\'', '').replace(',', ' &')) # print(score)
        body = "Number: {}\nSpecies: {}\nAccuracy: {}".format(species_number, species, score)
        send_email("redacted@robot.zapier.com", "New bird(s) detected!", body, image_url)
        # Send information to Node Red
        files = {'upload_file': open(image_url, 'rb')}
        url = "http://redacted:redacted@192.168.1.16:1880/endpoint/bird_feeder"
        requests.post(url, files = files, data = variables.result_bird)
# START IMAGE HELPERS
def append_objs_to_img(image, inference_size_detect, objs, labels):
    logging.info("Appending labels to detect image [in memory]...")
    height, width, channels = image.shape
    scale_x, scale_y = width / inference_size_detect[0], height / inference_size_detect[1]
    for obj in objs:
        bbox = obj.bbox.scale(scale_x, scale_y)
        x0, y0 = int(bbox.xmin), int(bbox.ymin)
        x1, y1 = int(bbox.xmax), int(bbox.ymax)
        percent = int(100 * obj.score)
        label = "{}% {}".format(percent, labels.get(obj.id, obj.id))
        image = cv2.rectangle(image, (x0, y0), (x1, y1), (0, 255, 0), 2)
        image = cv2.putText(image, label, (x0, y0 + 30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 0, 0), 2)
    return image
def crop_image(image, inference_size_detect, obj):
    logging.info("Cropping detect image ready for classification [in memory]...")
    height, width, channels = image.shape
    scale_x, scale_y = width / inference_size_detect[0], height / inference_size_detect[1]
    bbox = obj.bbox.scale(scale_x, scale_y)
    x0, y0 = int(bbox.xmin), int(bbox.ymin)
    x1, y1 = int(bbox.xmax), int(bbox.ymax)
    crop_correction = 0
    image = image[y0 - crop_correction : y1 + crop_correction, x0 - crop_correction : x1 + crop_correction]
    return image
def save_processed_image(image):
    logging.info("Saving detect image [to disk]...")
    detect_filename = "detect_" + str(variables.current_time) + ".jpeg"
    detect_filepath = os.path.join(script_directory, processed_directory, detect_filename)
    variables.result_bird["detect_url"].append(detect_filepath)
    cv2.imwrite(detect_filepath, image)
def save_classification_image(image, image_type, label, percent):
    logging.info("Saving " + image_type + " image [to disk]...")
    image_filename = image_type + "_" + str(variables.current_time) + "_[index_" + str(variables.detect_index) + "]_" + label + "_" + percent + ".jpeg"
    image_filepath = os.path.join(script_directory, processed_directory, image_filename)
    variables.result_bird[image_type + "_url"].append(image_filepath)
    cv2.imwrite(image_filepath, image)
# END IMAGE HELPERS
if __name__ == "__main__":
    main()
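The crop step in the script scales the detector's bounding box back to the full frame before slicing. A minimal sketch of that arithmetic with clamping added, so a box that spills past the frame edge cannot produce negative slice indices or an empty crop; `scale_and_clamp_bbox` is a hypothetical helper, and the clamping is an addition that is not part of the script above:

```python
def scale_and_clamp_bbox(bbox, inference_size, frame_size):
    """Map (xmin, ymin, xmax, ymax) from detector input space to frame pixels.

    inference_size and frame_size are (width, height) tuples. Returns integer
    coordinates clamped to the frame, or None if the box has no area left.
    """
    inf_w, inf_h = inference_size
    frame_w, frame_h = frame_size
    scale_x, scale_y = frame_w / inf_w, frame_h / inf_h
    xmin, ymin, xmax, ymax = bbox
    x0 = min(frame_w, max(0, int(xmin * scale_x)))
    y0 = min(frame_h, max(0, int(ymin * scale_y)))
    x1 = min(frame_w, max(0, int(xmax * scale_x)))
    y1 = min(frame_h, max(0, int(ymax * scale_y)))
    if x1 <= x0 or y1 <= y0:
        return None
    return x0, y0, x1, y1


if __name__ == "__main__":
    # A box from a 300x300 detector input mapped onto a 1920x1080 frame.
    print(scale_and_clamp_bbox((-10, 50, 150, 200), (300, 300), (1920, 1080)))
```

A non-None result can be used as `frame[y0:y1, x0:x1]`, and a None result could stand in for the image-validity check at the top of the classification function.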
Here's my current Viseron config too:
# See the README for the full list of configuration options.
logger:
  # default_level: debug
  # logs:
  #   viseron.components.ffmpeg: debug
  #   viseron.components.edgetpu: debug
  # cameras:
  #   bird_feeder: debug
ffmpeg:
  camera:
    viseron_bird_feeder:
      name: Viseron - Bird Feeder
      host: 192.168.1.3
      port: 7447
      path: /redacted
      # width: 1920
      # height: 1080
      # fps: 30
      # codec: h264
      # audio_codec: aac
edgetpu:
  object_detector:
    model_path: /detectors/models/custom/edgetpu/object_detection/tf2_ssd_mobilenet_v2_coco17_ptq_edgetpu.tflite
    label_path: /detectors/models/custom/edgetpu/object_detection/labels.txt
    device: pci
    cameras:
      viseron_bird_feeder:
        # log_all_objects: true
        fps: 10
        scan_on_motion_only: false
        labels:
          - label: bird
            confidence: 0.6
            trigger_recorder: true
  image_classification:
    model_path: /detectors/models/custom/cpu/image_classification/mobilenet_v2_1.0_224_inat_bird_quant.tflite
    label_path: /detectors/models/custom/cpu/image_classification/labels.txt
    device: cpu
    cameras:
      viseron_bird_feeder:
        labels:
          - bird
    # labels:
    #   - person
nvr:
  viseron_bird_feeder:
# MQTT is optional
mqtt:
  broker: 192.168.1.16
  port: 1883
  username: redacted
  password: redacted
  home_assistant:
    discovery_prefix: homeassistant
    retain_config: true
Good morning.
Can you explain these stats to me - I'm a little confused why I see this FPS per camera, and not for the Viseron platform?
How do I know if the edgetpu is overloaded? Currently I have 6 cameras, most at 10fps, one at 30fps.
Those stats are for the object detector, not for the camera. If the EdgeTPU was overloaded those numbers would decrease.
If you use the same object detector for multiple cameras they will have the same values.
I am trying really hard with the image classification, by the way, but I can't seem to figure out what's wrong :( Will keep looking.
I've been working hard on my config. I've now set up substreams and have all 12 cameras added!
Fingers crossed you can get classification working, have you tested my code and confirmed it works for you btw?
Wow, 12 cameras, that's a lot!
No, I have not yet. I have been double-checking everything and it seems correct. I'm trying to set up a good test environment at the moment, but I don't have any spare cameras to use at the computer, so I need to look at alternatives.
I use my Unifi Protect rtsp streams, works perfect. Just started to use the substream option on a lower resolution stream, seems to work well.
I'm still struggling to understand the fps indicator: if I lower the object detector fps on all my feeds, it goes up, and vice versa. I'm trying to work out how I can tell when I'm at the limit... Sorry if that sounds dumb!
I also think this HA sensor should be type number or statistic (better), so we can see graph data :)
Adding temp for the USB and PCIE edgetpu would also be awesome!
I have a pcie card and I trigger cat /sys/class/apex/apex_0/temp
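For reference, that sysfs file can also be read from Python. A minimal sketch, assuming the value is reported in millidegrees Celsius (as the Coral PCIe documentation describes) and that the device is `apex_0`; `parse_apex_temp` and `read_apex_temp` are hypothetical helpers:

```python
from pathlib import Path

APEX_TEMP_PATH = Path("/sys/class/apex/apex_0/temp")  # path from the comment above


def parse_apex_temp(raw):
    """Convert the raw sysfs string (assumed millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def read_apex_temp(path=APEX_TEMP_PATH):
    """Return the TPU temperature in degrees C, or None if it cannot be read."""
    try:
        return parse_apex_temp(path.read_text())
    except (FileNotFoundError, PermissionError, ValueError):
        return None


if __name__ == "__main__":
    print(read_apex_temp())
```

Returning None instead of raising keeps the sketch usable on hosts without a PCIe TPU, where a sensor would simply report the value as unavailable.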
Any luck @roflcoopter?
Ran out of time today, will take a stab at it again during the week!
Regarding your other questions, the fps indicator shows the fps at which inference is running, to give you an indication of how fast a particular model is, for instance. A higher number means the inference is very fast; lower numbers mean you are getting closer to max performance.
And yes, it should definitely be a number. Will fix.
It doesn't seem to matter how I twist and turn my code; the confidence levels are always very low.
Could you give me an example of how you invoke your script, so I can try and see if I get similar results?
I just run the code above, and feed it with an RTSP feed (arg), I then use my phone to show the picture linked above. Confidence is around 80% first time.
Good day,
Currently using Frigate, but would love to try some of the other coral models available.
For example, my kids would love to know what birds are visiting the feeder (currently monitored via a UniFi camera with RTSP feeds available). Is it possible to use the MobileNet V2 model from this page?
https://coral.ai/models/image-classification/
If so, could you provide some pointers to get me started? I’ll be installing via Docker on Debian and have a PCI TPU card. The host is a NUC with Intel Quick Sync available.
Many thanks in advance.