dgarnitz / vectorflow

VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
https://www.getvectorflow.com/
Apache License 2.0
676 stars 49 forks

Add telemetry #91

Closed dgarnitz closed 1 year ago

dgarnitz commented 1 year ago

What

Added PostHog tracking of vector DB and embedding metadata to provide product analytics on usage. Added a flag, TELEMETRY_DISABLED=True, that turns this functionality off.

A repeat user is detected by creating a config.json containing a randomly generated user ID in the container's tmp/. The file disappears when the Docker container is deleted, so a new ID is generated for the next container.
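For reviewers, the opt-out flag and the repeat-user ID can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the function names, the `user_id` key, the accepted truthy spellings of the flag, and the use of the system temp directory are all assumptions, and the PostHog client is replaced by a hypothetical `send` callable so the sketch runs without the dependency.

```python
import json
import os
import tempfile
import uuid
from pathlib import Path

# Assumed location: the PR only says config.json lives in the container's tmp/.
CONFIG_PATH = Path(tempfile.gettempdir()) / "config.json"


def telemetry_disabled() -> bool:
    """True when TELEMETRY_DISABLED is set to a truthy value (spellings are an assumption)."""
    return os.getenv("TELEMETRY_DISABLED", "").strip().lower() in {"true", "1", "yes"}


def get_or_create_user_id(config_path: Path = CONFIG_PATH) -> str:
    """Reuse the ID stored in config.json, or generate and persist a new random one."""
    if config_path.exists():
        try:
            data = json.loads(config_path.read_text())
            if isinstance(data, dict) and data.get("user_id"):
                return data["user_id"]
        except (json.JSONDecodeError, OSError):
            pass  # Unreadable or corrupt file: fall through and regenerate.
    user_id = str(uuid.uuid4())
    config_path.write_text(json.dumps({"user_id": user_id}))
    return user_id


def track(event: str, properties: dict, send, config_path: Path = CONFIG_PATH) -> bool:
    """Forward an event to `send` (hypothetical stand-in for the analytics client).

    Returns False without sending anything when telemetry is disabled.
    """
    if telemetry_disabled():
        return False
    send(get_or_create_user_id(config_path), event, properties)
    return True
```

Because the ID lives in a file inside the container, two uploads from the same container share one ID, while a rebuilt container starts with a fresh one.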

Fixed tests

Verification

The following screenshot shows two events fired for a successful multi-file upload. The container was then torn down and restarted with TELEMETRY_DISABLED=True set, and no events were fired. Finally, the flag was removed and two more events appeared:

image