serverless / serverless

⚡ Serverless Framework – Use AWS Lambda and other managed cloud services to build apps that auto-scale, cost nothing when idle, and boast radically low maintenance.
https://serverless.com
MIT License

Add new features for tracking functionality #1958

Closed pmuens closed 7 years ago

pmuens commented 7 years ago

Our anonymous tracking functionality needs some additional changes so that we can understand the usage of the Serverless framework better.

Here are the proposed changes:

/cc @serverless/core @worldsoup

worldsoup commented 7 years ago

@pmuens let's also track when there's an error so we can see which functions/commands result in an error.

pmuens commented 7 years ago

@worldsoup thanks! Updated the description above!

andymac4182 commented 7 years ago

Will any of this information be public? It would be great to see the info.

pmuens commented 7 years ago

@andymac4182 right now it's used internally to understand the usage patterns and derive the needs for upcoming releases.

flomotlik commented 7 years ago

@andymac4182 we might share this in the future in blog posts (very likely), but we don't have any plans for it yet as it's still super early.

Bilal-S commented 7 years ago

Team: I understand the desire to track and improve. However, this is incredibly bad practice and about as close to an ethical grey area as it gets for software.

Good practice: Any tracking, of anything, at any time, requires explicit user consent.

Bad practice: Automatically tracking anything (even if anonymous), at any time. There are enough industry examples of small and large companies falling face first into the ditch using this practice. I would urge you not to go down this path.

You could ask during project creation whether tracking should be enabled, ask during installation, or invite volunteers who wish to share their data to download a special build.

I am now compelled to review all code for hidden backdoors and disable them.

worldsoup commented 7 years ago

Hey @Bilal-S, thanks for the concern and input. Our goal is to build the best product we can for our users, and at this time we think that this is the best way to do that. We never track any sensitive information (no code, everything is anonymous, etc.) and never will.

Tracking usage is a very common practice in the software industry: practically every web app tracks usage, and that data is typically associated with names/emails (not anonymous).

flomotlik commented 7 years ago

One of the things we're also working on for the V1 release is making tracking more obvious, e.g. clearer in the docs and also in commands that you're running after installing serverless. This way it should be clear to everyone that we're tracking command calls, what we're tracking and how to turn it off.

For us it is really important that we deeply understand how our users use the product and we need the broadest and most applicable data possible so we can decide what the next steps are. And of course this data is also important for us to understand as a business because it means we understand our users, understand our growth and can make sure usage of the Framework is growing stronger in the future and we can build more and better functionality with it. Without that there is simply a much larger risk that we're not able to succeed and therefore can't keep working and devoting the time we want to on the framework.

Of course, for this to be fair to users and to us, it has to be very clear and transparent what data we're collecting, how we're using it internally, and who we share it with (and in which form).

Bilal-S commented 7 years ago

Florian (@flomotlik), I appreciate that you have taken the time to respond with well-laid-out reasoning. Though I agree with the need outlined, I respectfully disagree with the implementation.

I agree that gathering usage statistics to drive development and activity is a common and desirable goal for a software company. However, I find that companies with foresight tend to break this down further into what it truly means in terms of actionable data gathering. They normally do a quick statistical sample-size check and find they can reach a 99% confidence level while sampling only a small percentage of their users. They then decide that asking for volunteers is probably a better approach to building trust in the community. Thus, I would challenge the premise that obtaining relevant data requires collecting all users’ transaction data.
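As a rough illustration of the sample-size check described above, here is a minimal sketch using Cochran's formula with a finite population correction. The 2% margin of error, the worst-case proportion of 0.5, and the user counts are assumptions chosen for illustration; none of them come from this thread. The formula also assumes a simple random sample, which a self-selected group of volunteers does not strictly satisfy.

```python
import math
from statistics import NormalDist


def required_sample_size(population: int, confidence: float = 0.99,
                         margin: float = 0.02, proportion: float = 0.5) -> int:
    """Sample size needed to estimate a proportion (Cochran's formula,
    with finite population correction).

    population -- total number of users (hypothetical figure)
    confidence -- desired confidence level, e.g. 0.99
    margin     -- acceptable margin of error, e.g. 0.02 for +/- 2 points
    proportion -- expected proportion; 0.5 is the most conservative choice
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 < confidence < 1 or not 0 < margin < 1:
        raise ValueError("confidence and margin must be between 0 and 1")

    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Shrink it for a finite population.
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


if __name__ == "__main__":
    for users in (1_000, 50_000, 1_000_000):
        n = required_sample_size(users)
        print(f"{users:>9} users -> sample of {n} ({n / users:.1%})")
```

Under these assumptions the required sample grows very slowly with the population, so the sampled fraction shrinks as the user base grows. A sample like this estimates proportions such as "share of users running a given command"; it does not by itself give a total user count.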

Even where statistics are collected by default, this is made quite obvious; think of the Firefox install and startup, or even the Windows install from Microsoft, the perceived evil overlord of computing. It is far clearer there that collection occurs and that there is a switch to turn it off. Now, I understand that whether that switch is honored is a debate unto itself. Nonetheless, the Serverless team’s effort to show the tracking status more clearly is definitely a big step in the right direction and is appreciated.

I still believe all data collection should be opt-in and require consent. Thus, people should have the choice to turn it on. It could be as simple as asking during the first 25 command executions: “Please help us improve serverless, enable statistics tracking by running command X”.
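The opt-in flow suggested above could look roughly like the following sketch. It is not how the Serverless Framework implements tracking: the state file, the function names, and the JSON keys are all hypothetical, and "command X" is left unspecified as in the suggestion.

```python
import json
from pathlib import Path
from typing import Optional

MAX_PROMPTS = 25  # remind only during the first 25 command runs
# Reminder text taken from the suggestion; "command X" stays a placeholder.
PROMPT = ("Please help us improve serverless, enable statistics tracking "
          "by running command X")


def load_state(path: Path) -> dict:
    """Read persisted state; tracking stays off until the user enables it."""
    state = {"tracking_enabled": False, "prompts_shown": 0}
    if path.exists():
        state.update(json.loads(path.read_text()))
    return state


def on_command_run(path: Path) -> Optional[str]:
    """Call once per CLI invocation; returns a reminder to print, or None."""
    state = load_state(path)
    if state["tracking_enabled"] or state["prompts_shown"] >= MAX_PROMPTS:
        return None
    state["prompts_shown"] += 1
    path.write_text(json.dumps(state))
    return PROMPT


def enable_tracking(path: Path) -> None:
    """What the hypothetical "command X" would do: record explicit consent."""
    state = load_state(path)
    state["tracking_enabled"] = True
    path.write_text(json.dumps(state))


def should_send_stats(path: Path) -> bool:
    """Statistics are sent only after explicit opt-in."""
    return load_state(path)["tracking_enabled"]
```

The key design choice is the default: `tracking_enabled` starts as `False`, so nothing is sent unless the user runs the enabling command, and the reminder stops on its own after 25 runs.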

If you believe in your product this should be an easy decision. Achieving statistically relevant data collection would not be a challenge.

In the end, this is less about me, I can remove the tracking hooks, but rather about the spirit of the company as represented by its actions. Are users valued as partners or just a resource to be dealt with in whatever way benefits the company? Simply said, I urge you to do better.

flomotlik commented 7 years ago

Regarding taking only samples from a subset: while that might work for some tests, I doubt that we can get a good enough picture with only a very limited view of the community. Furthermore, for us as a company it's very important not just to be able to determine the usage of specific features, but also to know what the total number of users is (or at least something very close to it). This shows us how we're growing and whether we're growing into the types of companies that allow us to monetize around the Serverless Framework, and it allows us to use those aggregate user numbers in investor meetings. For building a long-term sustainable company this is non-trivial and key. So by collecting this data completely anonymously, and in a way where everybody can see what's happening (and disable it), I think it's very fair.

I fully understand your thoughts and hesitations there, and we need to make sure, by being very clear about the tracking and how to turn it off, that nobody is ever surprised by it. I do have to push back very strongly on one specific point though:

“In the end, this is less about me, I can remove the tracking hooks, but rather about the spirit of the company as represented by its actions. Are users valued as partners or just a resource to be dealt with in whatever way benefits the company? Simply said, I urge you to do better.”

It is very unfair, in my opinion, to frame this as a choice where either we track only the few people who opt in, or we clearly see our users only as a resource to be exploited. I take a lot of pride in the way we're collaboratively building our community, engaging on every topic, even the more controversial ones, and never trying to hide things or be less than transparent about what we're doing or what our intentions are. Of course, going forward (in this issue and others) we need to make good on our promise to keep being open and transparent about everything we're doing. And I think at this point we deserve a bit of the benefit of the doubt to show that we're following up on those promises (and of course if we're not, I expect and want our community to call us out on it).

Bilal-S commented 7 years ago

Florian, thank you for taking the time to respond.

pmuens commented 7 years ago

Done with #2101. Furthermore, we'll add in-depth documentation about this in #2279. /cc @flomotlik