Server is crashing when exceeding errlog.io quota

elad-bar commented 1 year ago

Issue: The application uses a freemium errlog.io account. When the quota (5k requests per day) is exceeded, errlog.io returns an error, which causes messages to stay in the queue and crashes the service.

When the environment variable CPAI_ERRLOG_APIKEY is empty, the apiKey is invalid, which raises an exception with the same result.

Use case: I have an NVR solution (Shinobi Video) that calls the API (I originally wrote the integration for DeepStack) for any motion detected on a specific set of cameras. Many events during the day are caused by wind, cats, and other things that involve no face, so the server treats them as errors and tries to report them to errlog.io.

Solution: Add validation for the errlog.io API key; if it is empty or None, print the log to the console instead.
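A minimal sketch of that validation, assuming the key is read from CPAI_ERRLOG_APIKEY as described above. The names `resolve_api_key`, `report_error`, and the `send_remote` callable are hypothetical stand-ins and not the server's actual code. The sketch also catches a failure of the remote call (for example, a quota error) so that the caller keeps running:

```python
import logging
import os

logger = logging.getLogger("errlog_sketch")


def resolve_api_key(environ=os.environ):
    """Return the trimmed API key, or None when it is missing or blank."""
    key = environ.get("CPAI_ERRLOG_APIKEY")
    if key is None or not key.strip():
        return None
    return key.strip()


def report_error(message, send_remote, environ=os.environ):
    """Report an error remotely if possible, otherwise log it locally.

    send_remote is a hypothetical callable (api_key, message) standing in
    for whatever actually posts to errlog.io.
    Returns "remote" or "console" so the caller can see which path ran.
    """
    api_key = resolve_api_key(environ)
    if api_key is None:
        # No usable key: skip the remote call entirely.
        logger.error(message)
        return "console"
    try:
        send_remote(api_key, message)
        return "remote"
    except Exception as exc:  # quota exceeded, network failure, rejected key
        # Never let a reporting failure propagate to the caller.
        logger.error("%s (remote reporting failed: %s)", message, exc)
        return "console"
```

With no handlers configured, `logger.error` falls back to Python's last-resort handler, which writes to stderr.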

elad-bar commented 1 year ago

Great, thanks. In general, why not use Python's default logging component?

ChrisMaunder commented 1 year ago

To be brutally honest, a lack of familiarity with Python is part of the reason for not using the basic logging system. By this I mean we didn't see a simple way to route log messages to different logging providers using the inbuilt Python logger. All logging has to, eventually, go through the server in order for it to be available to the dashboard, and for it to be a system that all modules, regardless of tech stack, can use. Add to this the need for async logging (added in our current dev branch), and it seemed the basic logging was simply a wrapper for stdout with filtering.

But as I said: we're newbies at this so I'm fairly sure, given the maturity of Python, that we're missing something obvious!
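For reference, the standard library does cover routing to several destinations without blocking the caller: each destination is a `logging.Handler`, and `QueueHandler` / `QueueListener` move delivery onto a worker thread. The sketch below is only an illustration under that assumption. `ListHandler` and `build_logger` are hypothetical names, and the in-memory list stands in for a dashboard or remote sink:

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Hypothetical sink that keeps formatted messages in a list."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


def build_logger(name, sinks):
    """Return (logger, listener); records fan out to sinks off-thread."""
    log_queue = queue.Queue()
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    # The caller only pays for putting the record on the queue.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # The listener thread delivers each record to every sink whose
    # level allows it (respect_handler_level=True).
    listener = logging.handlers.QueueListener(
        log_queue, *sinks, respect_handler_level=True
    )
    listener.start()
    return logger, listener
```

A file sink could be a `logging.FileHandler` and the console a `logging.StreamHandler`, passed in the same `sinks` list. Calling `listener.stop()` at shutdown flushes whatever is still queued.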

elad-bar commented 1 year ago

If you would like, I can add some logging once you apply the next fix. Sending logs to the API (.NET Core) is an IO-based operation, and the API then logs to file and to errlog.io by itself, which is 2 additional IO operations every time the processing didn't find a file. I think that's too much. I would suggest that each component log by itself to the console (configured in .NET Core's applicationsettings.json), with some changes to the Python code (intelliserver) so that it writes logs directly to its own console.

Please let me know if you would like me to apply these modifications.

ChrisMaunder commented 1 year ago

The important part of the logs is that we need to have them displayed / stored in multiple ways:

1. We need to store them to the log files so users can send us the logs if there's an issue.
2. We need to display them in the dashboard so the user gets real-time updates on what's happening.
3. We need to send (some) logs to our logging API so we can be alerted to issues with new releases in real time.
4. Finally, we dump all logs to the console just so we can see everything unfiltered during development.

Our latest changes make this all async so it's not a performance issue.

Ultimately the best solution is to have each included module (.NET, Python, node.js, Swift, whatever) use its own logging mechanism, with our SDK adapter code capturing this and routing it to our common logger, so we can then filter it, store it, display it, and/or send it to our error logging server based on our AI server's common logging settings.
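On the Python side, one way such an adapter could look is a `logging.Handler` that turns each record into a stack-neutral JSON payload and hands it to a forwarding function. This is a sketch under assumptions: `ForwardingHandler`, the payload fields, and the `forward` callable are all hypothetical, and the real SDK may work differently:

```python
import json
import logging


class ForwardingHandler(logging.Handler):
    """Capture native Python log records and pass them to a common logger.

    forward is a hypothetical callable taking one JSON string; in practice
    it might enqueue the payload for delivery to the server.
    """

    def __init__(self, module_id, forward):
        super().__init__()
        self.module_id = module_id
        self.forward = forward

    def emit(self, record):
        payload = {
            "module": self.module_id,
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        try:
            self.forward(json.dumps(payload))
        except Exception:
            # Standard logging convention: report the failure on stderr
            # but never raise into the code that called the logger.
            self.handleError(record)
```

Module code would keep calling `logging.getLogger(__name__).warning(...)` as usual. Filtering, storage, and display decisions then stay on the server, applied to the forwarded payloads.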

elad-bar commented 1 year ago

I'm just sharing my 2c from the experience I have in development over the last 20 years, working in pretty big organizations (as a developer as well as in senior management) on high-load services / products with millions of users (including 3rd-party developers integrating with the solutions).

my thoughts regarding the needs you have raised:

1. Not an issue: you still get the same result. The user can copy/paste logs from the console and send them. (An alternative is to allow storing them to a file in addition to the console, but that's overkill IMHO.)
2. For a developer, once again, the logs are available through the console. For a user running Docker, they are available through the container logs. (Again, writing to a file in addition can cover that part.)
3. Sending logs about issues with a new version is not common without the option to opt out, for privacy reasons.
4. You can decide which modules log at which level, the same as is done within .NET Core.
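For the last point, Python's standard logging supports the same per-module idea through its dotted logger hierarchy. A small sketch, where `apply_levels` and the logger names in the usage note are hypothetical:

```python
import logging


def apply_levels(levels):
    """Set a minimum level per logger name.

    Child loggers (e.g. "a.b" under "a") inherit the level of their
    nearest configured ancestor unless they set their own.
    """
    for name, level in levels.items():
        logging.getLogger(name).setLevel(level)
```

For example, `apply_levels({"modules.face": "WARNING", "server": "DEBUG"})` would silence INFO output from everything under `modules.face` while keeping the server verbose.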

I think the solution you are building is about image / text processing using AI tools. For logs there is a standard: invest in the product (processing) and less in the logging capabilities, since there are tools for that (ELK, console, etc.). When the logs and documentation for an API are descriptive, users will find a way to work with it.

For any of the suggestions I have raised, I'm fully in to contribute and take it there :)

elad-bar commented 1 year ago

Hi @ChrisMaunder, 2 additional thoughts:

When no faces are found, it should be considered a warning at most, not an error.
Instead of saving logs for real-time analysis, maybe a metrics API can do the work better, following the same pattern as node exporter, which feeds Prometheus and Grafana (for the dashboard). That way the need for an analytics UI is met by a 3rd-party solution, which again allows focus on the solution you are providing.
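Both thoughts can be sketched together: a "no face" result is logged at WARNING and counted, and the counts are rendered in the Prometheus text exposition format for a scraper to collect. This is a hand-rolled, standard-library-only illustration. The metric name, `record_detection`, and `render_metrics` are hypothetical, and a real deployment would more likely use an existing Prometheus client library:

```python
import logging
from collections import Counter

logger = logging.getLogger("detection_sketch")
outcomes = Counter()  # outcome label -> number of requests


def record_detection(faces_found):
    """Count the outcome; a missing face is a warning, not an error."""
    if faces_found == 0:
        outcomes["no_face"] += 1
        logger.warning("no face found in image")
    else:
        outcomes["face_found"] += 1


def render_metrics():
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP detection_requests_total Detection requests by outcome.",
        "# TYPE detection_requests_total counter",
    ]
    for outcome in sorted(outcomes):
        lines.append(
            f'detection_requests_total{{outcome="{outcome}"}} {outcomes[outcome]}'
        )
    return "\n".join(lines) + "\n"
```

Serving the string returned by `render_metrics()` from an HTTP endpoint would give Prometheus something to scrape, with Grafana handling the dashboard side.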

codeproject / CodeProject.AI-Server

Server is crashing when exceeding errlog.io quota #9