Open ipspace opened 3 days ago
Some very draft ideas:
Receiving and storing data can be achieved using Cloudflare Workers with KV or D1 storage, or with AWS Lambda and DynamoDB (with some limits in place).
To demonstrate that we do not collect sensitive data, we could also show the collected data and some reporting?
(Edit: if we need more resources for collecting and storing data we could apply for this? https://blog.cloudflare.com/expanding-our-support-for-oss-projects-with-project-alexandria )
Ivan's proposed collection mechanism is a plain-text YAML dictionary, so any user can see the collected data, and the upload is user-triggered, so I guess this covers the issue.
I personally would also be interested in which host OSes netlab is used on.
> To demonstrate that we do not collect sensitive data, we could also show the collected data and some reporting?
> I personally would also be interested in which host OSes netlab is used on.
Agree. Would you please document how we could collect that in a way that would work on most Linux distros while still providing reasonably easy-to-interpret results?
For example, `uname -a` produces a printout that someone might be able to deduce the Ubuntu release from, but it's way beyond my capabilities. Anyway, according to https://gist.github.com/natefoo/814c5bf936922dad97ff, the whole thing is a bit of a mess.
> Receiving and storing data can be achieved using Cloudflare Workers with KV or D1 storage, or with AWS Lambda and DynamoDB (with some limits in place).
These days I would definitely go with CF Workers + KV/D1/R2.
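Whichever backend ends up receiving the data, the "limits" part could look roughly like the sketch below. It is backend-agnostic and makes several assumptions that are not in the proposal: that the upload arrives as JSON, that it is a flat dictionary of non-negative integer counters, and that the caps (`MAX_BODY_BYTES`, `MAX_KEYS`) and the function name `validate_upload` are hypothetical values chosen only for illustration.

```python
import json

# Assumed caps, picked only for illustration
MAX_BODY_BYTES = 16 * 1024
MAX_KEYS = 200


def validate_upload(body: bytes) -> dict:
    """Reject oversized or oddly shaped uploads before storing anything."""
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("payload too large")
    data = json.loads(body)  # raises ValueError on malformed JSON
    if not isinstance(data, dict) or len(data) > MAX_KEYS:
        raise ValueError("expected a small flat dictionary")
    clean = {}
    for key, value in data.items():
        # Accept only non-negative integer counters (bool is an int subclass)
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"bad counter for {key!r}")
        clean[key] = value
    return clean
```

Accepting nothing but counters would also make it harder for free-form (and potentially sensitive) text to end up in the stored data.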
> To demonstrate that we do not collect sensitive data, we could also show the collected data and some reporting?
"The user could inspect the usage data with netlab usage show" ;) https://github.com/ipspace/netlab/blob/dev/docs/roadmap/usage.md?plain=1#L19
OK for the inspection of collected data, but seeing some "reporting stats" could be interesting, IMHO.
> OK for the inspection of collected data, but seeing some "reporting stats" could be interesting, IMHO.
That's why I was thinking a GitHub repo might be a nice option - it puts the (anonymized) reported stats in a public place that people can go look at if they want to - not hidden in some backend database
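To make the "reporting stats" idea a bit more concrete, here is a minimal sketch of how anonymized reports could be summed into a public summary. The report layout (a flat dictionary of counters with keys such as `device.frr`) and the function name `aggregate_reports` are assumptions made for illustration, not the format defined in the proposal.

```python
from collections import Counter


def aggregate_reports(reports):
    """Sum per-key counters across all submitted reports."""
    totals = Counter()
    for report in reports:
        for key, count in report.items():
            totals[key] += int(count)
    # Most frequently used items first
    return dict(totals.most_common())


# Hypothetical reports from two users
reports = [
    {"device.frr": 3, "module.ospf": 2},
    {"device.frr": 1, "module.bgp": 4},
]
summary = aggregate_reports(reports)
```

A summary like this could be regenerated and committed to the repo whenever new reports arrive, so anyone can see exactly what is being published.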
I like this, conceptually. There is nothing like letting the user watch the data.
> That's why I was thinking a GitHub repo might be a nice option - it puts the (anonymized) reported stats in a public place that people can go look at if they want to - not hidden in some backend database
Sure, I'll look into it, and yes, you are right, this can be a can of worms. I had to fight it recently with CMake; its Linux detection sucks, so I had to overwrite the variables.
> Agree. Would you please document how we could collect that in a way that would work on most Linux distros while still providing reasonably easy-to-interpret results?
> I like this, conceptually. There is nothing like letting the user watch the data.
> That's why I was thinking a GitHub repo might be a nice option - it puts the (anonymized) reported stats in a public place that people can go look at if they want to - not hidden in some backend database
Maybe we could even talk to GitHub and make this into an officially supported feature. Usage data for open source projects voluntarily provided by GitHub users would be a great addition - I think many projects would use that
Perhaps the best way to determine the OS name without descending into madness is to use a systemd component, hostnamectl. It returns the correct distro name in its output. It will, of course, only work on systems using systemd, but in 2024 all mainstream distros use it. Where it will fail is on musl libc based distros, which still use alternative init systems by necessity (Alpine, Void Linux, Chimera), and on specialty distributions (embedded ... whatever).
> Agree. Would you please document how we could collect that in a way that would work on most Linux distros while still providing reasonably easy-to-interpret results?
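As an alternative to shelling out to `uname -a` or hostnamectl, a minimal sketch of host OS detection could read the `os-release` file directly. As far as I know, that file is present on most current distros, including non-systemd ones such as Alpine, but treat that as an assumption to verify. The function names `parse_os_release` and `detect_host_os` are hypothetical and not part of netlab.

```python
import platform
import shlex
from pathlib import Path

# Standard locations of the os-release file
OS_RELEASE_PATHS = ("/etc/os-release", "/usr/lib/os-release")


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines, removing shell-style quotes from values."""
    data = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        try:
            parts = shlex.split(raw)
        except ValueError:  # unbalanced quotes: keep the raw value
            parts = [raw]
        data[key] = parts[0] if parts else ""
    return data


def detect_host_os() -> dict:
    """Return a short, easy-to-interpret description of the host OS."""
    for path in OS_RELEASE_PATHS:
        try:
            info = parse_os_release(Path(path).read_text(encoding="utf-8"))
        except OSError:
            continue  # file missing or unreadable, try the next location
        return {"id": info.get("ID", "linux"), "version": info.get("VERSION_ID", "")}
    # No os-release file (for example macOS): fall back to the platform module
    return {"id": platform.system().lower(), "version": platform.release()}
```

On Ubuntu 22.04 this should yield something like `id: ubuntu` and `version: 22.04`, which is easy to aggregate. On Python 3.10 or newer, `platform.freedesktop_os_release()` from the standard library reads the same file.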
It would be great to know how people use netlab; currently, we can only guess as we get little feedback and zero hard data.
The proposal to implement the usage data collection and eventual upload is in `docs/roadmaps/usage.md`. Feedback or PRs against that file are most welcome.