open-telemetry / community

OpenTelemetry community content
https://opentelemetry.io
Apache License 2.0

Propose Developer Experience project #2144

Closed tsloughter closed 2 months ago

dmathieu commented 3 months ago

I am interested in joining this project.

yurishkuro commented 3 months ago

I recommend the name Usability, not Convenience.

tsloughter commented 3 months ago

I'm good with a name change if people prefer "usability". I was also considering "developer experience".

tsloughter commented 3 months ago

@yurishkuro what do you think of "developer experience"?

yurishkuro commented 3 months ago

yeah, sounds good

tsloughter commented 3 months ago

@theletterf the plan is a ranked list where we pick the top 3. The 3 is just to give the project a goal to reach rather than an open ended, "fix this list". After the 3 the group can decide whether to propose continuing as a full SIG or disbanding and waiting to regroup when there are resources available again to take on more issues.

How to reach the right users will be worked on with the End User SIG.

UX research interviews? Would that be like a user trying to instrument an application for the first time live with an interviewer watching?

julianocosta89 commented 3 months ago

+1 for "Developer Experience" and I'd like to be involved in this SIG. Thanks @tsloughter for bringing it back to life!

martinjt commented 3 months ago

More than happy to be involved.

One note from skimming through. I believe we should stay away from making developer experience functions, methods etc. consistent across all languages.

The key issues I see right now are when we try to make something work across all languages the same when there are massive differences between the idiomatic style and flow of different languages and ecosystems.

I think what we need to do is work out how we provide spec flexibility to allow them, while ensuring standards.

svrnm commented 3 months ago

One note from skimming through. I believe we should stay away from making developer experience functions, methods etc. consistent across all languages. The key issues I see right now are when we try to make something work across all languages the same when there are massive differences between the idiomatic style and flow of different languages and ecosystems.

Those 2 things are not mutually exclusive by default. You might have expected me to say that already, but I'll say it anyway ;-) -- Consistency is a good thing! So it is something worth striving for as long as it does not compete with other goals of this SIG, namely providing a good developer experience (which I agree the "flow of a language and ecosystem" can be part of). The keywords for consistency I find important in the proposal are "named differently" and "experiencing different results". Especially the first one is one that hurts us badly already. This can be improved in the proposal, but eventually it's up to the SIG to draw a line between consistency and other criteria for accomplishing a good experience.

tsloughter commented 3 months ago

@dmathieu @julianocosta89 @martinjt are you interested in providing prototypes for particular languages? If so I'd like to note next to your names, in addition to company, what language will be covered by your involvement. I'd like to have at least 2 or 3 covered before considering the project personnel needs met.

martinjt commented 3 months ago

Happy to take .NET

dmathieu commented 3 months ago

I can take Go

julianocosta89 commented 2 months ago

@dmathieu @julianocosta89 @martinjt are you interested in providing prototypes for particular languages? If so I'd like to note next to your names, in addition to company, what language will be covered by your involvement. I'd like to have at least 2 or 3 covered before considering the project personnel needs met.

I can take Java or Rust

tsloughter commented 2 months ago

@martinjt @dmathieu @julianocosta89 ok, great!

I'm adding a part to the doc to cover what @lmolkova brought up about surveying SIGs about existing developer experience improvements they've done and then this should be good to merge and we can move on to slack and setting up a meeting time.

samsp-msft commented 2 months ago

I would like to be involved from the .NET perspective ;-)

stevejgordon commented 2 months ago

I'd be interested in helping with .NET too, if needed.

austinlparker commented 2 months ago

I'm in complete agreement with the proposal, but I think we need to have consensus about the name. We have a proposal for a "contributor experience" SIG, and it will likely be confusing to have a "developer experience" one as well, especially as a good number of contributors are developers.

I don't have a good suggestion, but I know that I'm not the only one with that concern. Therefore, I'm blocking this until we have consensus.

"Developer Experience" and "Contributor Experience" are both terms of art, and for the uninitiated they can be explained in a sentence or two.

samsp-msft commented 2 months ago

What I am interested in is a group dedicated to the experience for Application & Library developers who are consuming OpenTelemetry and using it in their applications. That is a much larger and different audience from those contributing to OpenTelemetry or writing tools such as an APM to consume and display data collected by OpenTelemetry.

Background

On the .NET team, we have included OpenTelemetry as part of the Aspire experience. The Aspire project templates include the OTel libraries, setting up OTLP export and for each component referenced that includes telemetry, we enable it automatically. We added an OTLP endpoint to a dashboard that was originally intended just to show the container running state and have access to the console logs. The dashboard gives a live view of structured logs, metrics and traces, in views conceptually similar to the more established products. The difference is that it's targeted at showing the incoming data in a live fashion. Data is held in memory, there is no storage, no trends, no alerts, no custom dashboard views a-la Grafana. It's much simpler, and more akin to a Wireshark equivalent for OTLP.

My reason for creating the OTel views in the dashboard was to aid developers with adding their own telemetry - reducing the cost and complexity of setting up the combination of Prometheus, Grafana, Jaeger and a logging viewer. The dashboard is launched automatically as part of the F5 (run) experience from Visual Studio.

One of the (to me) unexpected lessons from adding the OTel display features of the Aspire dashboard to the Aspire app development experience is that OTel isn't just for analyzing apps in production - it's equally valuable for analyzing apps during the iterative development loop - especially for distributed apps running as a series of containers, as the same kinds of connectivity and latency problems will show up at development time. Tracing is a great way to verify if services are interacting in the way that you expect. Logs are the way to understand why a container failed.

If you have and use telemetry as part of the application development phase, then you have a much higher chance that you'll be collecting the right telemetry at runtime.

Takeaways

If the OpenTelemetry community can provide better tools & defaults to developers to make it easier to use and consume telemetry as part of their regular development cycle, then adding telemetry will become second nature as it's used day to day as part of their own inner loop, not just by ops in production. Ensuring good telemetry is not a tax, it enables a better local test and diagnostics cycle.

There were a couple of talks at the community day that touched on the concept of using telemetry as part of the E2E testing for applications. I can imagine collecting and analyzing the telemetry produced by a test run in a CI/CD system, comparing the telemetry from previous runs, and highlighting metrics and span differences from a baseline and then either producing a report or failing the run if it exceeds a set threshold.
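
As a rough illustration of that baseline comparison idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `SpanStat` shape, the `compare_runs` name and the 20% default threshold are hypothetical, and a real setup would first have to aggregate exported spans into per-name statistics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanStat:
    """Aggregated view of one span name within a test run (hypothetical shape)."""
    name: str
    count: int
    mean_ms: float


def compare_runs(baseline, current, max_slowdown=0.20):
    """Compare two runs; return (notes, failures) as lists of strings.

    notes    -- structural differences worth reporting (spans added or missing)
    failures -- spans whose mean duration grew by more than max_slowdown
    """
    base = {s.name: s for s in baseline}
    cur = {s.name: s for s in current}

    notes = [f"missing span: {n}" for n in sorted(base.keys() - cur.keys())]
    notes += [f"new span: {n}" for n in sorted(cur.keys() - base.keys())]

    failures = []
    for name in sorted(base.keys() & cur.keys()):
        before, after = base[name].mean_ms, cur[name].mean_ms
        # Relative slowdown against the baseline mean; skip zero baselines.
        if before > 0 and (after - before) / before > max_slowdown:
            failures.append(f"slower span: {name} ({before:.1f} ms -> {after:.1f} ms)")
    return notes, failures


# Made-up numbers, purely to show the shape of the report.
baseline = [SpanStat("GET /cart", 10, 100.0), SpanStat("db.query", 10, 20.0)]
current = [SpanStat("GET /cart", 10, 130.0), SpanStat("cache.get", 10, 1.0)]
notes, failures = compare_runs(baseline, current)
```

A CI step could print `notes` as the report and exit non-zero only when `failures` is non-empty, which matches the "report or fail past a threshold" split described above.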

I think having a group focused on the experience for those adding telemetry to their applications will increase the overall satisfaction of the developers using OTel in their solutions.

tsloughter commented 2 months ago

I've added a section about first talking to language SIGs to get their initial feedback.

tsloughter commented 2 months ago

As for the naming, DevEx/DX is a pretty common and well known term I believe. Enough so to be abbreviated often. Looks like "contributor experience" is as well. So I'd argue they should both remain with the same names.

We could consider something like, "End User Experience", but I think that is too broad while we are focused on developers using the API/SDK.

samsp-msft commented 2 months ago

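Do we have infrastructure from the CNCF etc for running surveys? devdiv@Microsoft uses SurveyMonkey as they have good infrastructure and can have org policies for PII management etc. We typically use surveys for two purposes:

  • statistical & sentiment analysis. Verbatim responses are useful when categorized to understand high level problem areas
  • Collecting contact details for further follow up in zoom/team calls to do a deeper interview as to the customer experience.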

The main challenge is getting the survey in front of the target audience's nose. GitHub doesn't have good infra for notifying users of a survey (such as having a banner). Pinned discussions depend on the user visiting that area, same for issues.

Second is going to be managing PII, this is an open group, but most participants won't want their information to be spread over the internet - any PII needs to be sandboxed and only used for the stated purpose, and have an expiry.

austinlparker commented 2 months ago

Do we have infrastructure from the CNCF etc for running surveys? devdiv@Microsoft uses SurveyMonkey as they have good infrastructure and can have org policies for PII management etc. We typically use surveys for two purposes:

  • statistical & sentiment analysis. Verbatim responses are useful when categorized to understand high level problem areas
  • Collecting contact details for further follow up in zoom/team calls to do a deeper interview as to the customer experience.

The main challenge is getting the survey in front of the target audience's nose. GitHub doesn't have good infra for notifying users of a survey (such as having a banner). Pinned discussions depend on the user visiting that area, same for issues.

Second is going to be managing PII, this is an open group, but most participants won't want their information to be spread over the internet - any PII needs to be sandboxed and only used for the stated purpose, and have an expiry.

We usually just use Google Forms. IIRC CNCF does use SurveyMonkey for stuff, there's probably a way we can get on it. To be quite honest, the biggest challenge the project faces in terms of surveys is (as you intuit) that we do not have a great way to get the survey in front of users other than social media and Slack messages.

That said, we can work with our vendor community to help broadcast/promulgate surveys and other feedback mechanisms, which we'll probably end up doing in this case. My generic preference is that rather than doing time-limited surveys, we build out better continuous reporting mechanisms, but that's kinda neither here nor there.

Regarding PII, we don't really wind up with PII issues because we don't collect any (I'm not sure we even collect e-mails on our existing surveys).

austinlparker commented 2 months ago

What I am interested in is a group dedicated to the experience for Application & Library developers who are consuming OpenTelemetry and using it in their applications. That is a much larger and different audience from those contributing to OpenTelemetry or writing tools such as an APM to consume and display data collected by OpenTelemetry.

While I think we are talking about similar things here, I also feel like perhaps we are not. This SIG -- at least, initially -- is designed to focus on the needs of instrumentors: developers who are writing OpenTelemetry code in their applications and libraries. While the other things you mention are important, and are part of DevEx, I think it's important for us to keep the initial scope of this SIG constrained.

If this SIG is successful in these initial goals, then it would seem prudent to expand to other topics, but given the amount of backlog the project currently suffers I believe we need to find some 'quick wins' as it were to address the main ergonomic pain points of instrumentors.

tsloughter commented 2 months ago

+1 to everything @austinlparker said.

I completely agree about the importance of the other parts of developer experience, and honestly have looked at Aspire as something I want to build for Erlang/Elixir -- like I started doing with Erleans to copy another Microsoft project :) -- but I wanted to purposely keep the scope of this project small to focus the group and not spread us too thin.

Then, things can change in a future DevEx project (or SIG) after the roaring success of this initial iteration :).

Whether to limit the scope of surveying as well can be left to discussion within the project. Maybe we'll want to still surface other areas of improvement in the initial report but limit our focus of work on the ones related to the spec.

As for discussion of how to perform the surveying I think we can leave that to after we've gotten this merged and take it to the project group?

danielgblanco commented 2 months ago

As for discussion of how to perform the surveying I think we can leave that to after we've gotten this merged and take it to the project group?

I agree. The End-User SIG has some basic guidance on how surveys are usually conducted https://github.com/open-telemetry/sig-end-user/tree/main/end-user-surveys.

I also think that scoping this SIG to a few actionable items makes a lot of sense. After this initial effort the SIG can re-evaluate if these efforts should continue on a more long-term basis in its current form (like a permanent SIG) or if there's a better way to tackle those issues, perhaps involving a combination/collaboration between End-User SIG and implementation SIGs directly.

lmolkova commented 2 months ago

I'm in complete agreement with the proposal, but I think we need to have consensus about the name. We have a proposal for a "contributor experience" SIG, and it will likely be confusing to have a "developer experience" one as well, especially as a good number of contributors are developers.

throwing in some alternative naming suggestions:

martinjt commented 2 months ago

One of the externally facing goals of this, at least in my eyes, is to give the users of OpenTelemetry confidence that we're making it better for them.

While I can understand the desire to try and have very well segregated and defined names internally, I think it will be detrimental to that goal.

We could say "developer experience is an overarching project covering contributors to the project as well as end users", however, I think that would soften the message too much.

Developer Experience is a term that, external to OSS Contributors, is widely known as making the tool easier to use for the people who use it.

In my eyes, we should absolutely stick with this being referred to as the "Developer Experience" project, and not try and segment/qualify it. That distinction can be made in the first paragraph and have the same effect, without diminishing the impact on the other experience projects.

austinlparker commented 2 months ago

If anyone has very strong opinions that this project should not be named Developer Experience, I would appreciate their feedback as soon as reasonably possible so that we can close on this and move forward.

jpkrohling commented 2 months ago

I need to catch up with this discussion, but I know that both @tedsuo and @svrnm expressed concerns about the name last week. I like @lmolkova's suggestions, having a slight preference for Instrumentation Development Experience, followed by Instrumentation Experience.

austinlparker commented 2 months ago

I need to catch up with this discussion, but I know that both @tedsuo and @svrnm expressed concerns about the name last week. I like @lmolkova's suggestions, having a slight preference for Instrumentation Development Experience, followed by Instrumentation Experience.

Could @tedsuo and @svrnm post their concerns in this PR so we're not playing telephone then?

svrnm commented 2 months ago

I need to catch up with this discussion, but I know that both @tedsuo and @svrnm expressed concerns about the name last week. I like @lmolkova's suggestions, having a slight preference for Instrumentation Development Experience, followed by Instrumentation Experience.

Could @tedsuo and @svrnm post their concerns in this PR so we're not playing telephone then?

Apologies for not coming back on this issue earlier, I was at KCD Munich and had some other things to follow up with.

I do not have a strong concern with it, I just want to make sure that "Developer Experience" how I understand it and what the SIG plans to do is congruent.

A good definition I found for Developer Experience comes from this GitHub blog post:

DevEx refers to the systems, technology, process, and culture that influence the effectiveness of software development. It looks at all components of a developer’s ecosystem—from environment to workflows to tools—and asks how they are contributing to developer productivity, satisfaction, and operational impact.

Another one from the Microsoft Engineering Fundamentals Playbook:

Developer experience refers to how easy or difficult it is for a developer to perform essential tasks needed to implement a change. A positive developer experience would mean these tasks are relatively easy for the team (see measures below).

The essential tasks are identified below.

  • Build - Verify that changes are free of syntax error and compile.
  • Test - Verify that all automated tests pass.
  • Start - Launch end-to-end to simulate execution in a deployed environment.
  • Debug - Attach debugger to started solution, set breakpoints, step through code, and inspect variables.

If effort is invested to make these activities as easy as possible, the returns on that effort will increase the longer the project runs, and the larger the team is.

If I read the proposal, the focus is on the API/SDK "ease of use" or providing convenience functions to use them:

The first deliverable will be the collection of experience in dealing with developer experience issues by each existing language SIG. This means not only additions to the API/SDK or libraries developed to enhance the experience for users but any that may be planned or are being thought about because of a frequent request from their users.

Based on what people commonly understand under "tools" for developer experience I see the following issue:

Tools is much bigger than what the language APIs/SDKs are doing. From the definitions I found and the understanding I had, the tools people have in mind are IDEs, AI assistance, automation tooling for tests, helpers for debugging, maybe even the chair they are sitting on, etc. Based on this comment by @samsp-msft and the answer by @austinlparker, this seems not to be in scope (initially), and there is the desire to keep the scope small.

So the SIG needs to ask themselves if/how to manage expectations if people assume that all of this is part of what they are going to do? Is this SIG going to look into making OpenTelemetry easier to build, test, start and debug?

If the answer to those questions is "Yes" I have no further objections.

austinlparker commented 2 months ago

So the SIG needs to ask themselves if/how to manage expectations if people assume that all of this is part of what they are going to do? Is this SIG going to look into making OpenTelemetry easier to build, test, start and debug?

If the answer to those questions is "Yes" I have no further objections.

If those are what's needed to make the developer experience better, sure? Better APIs isn't just "there should be declarative span creation sugar", it could also be "APIs should be easier to test" or "SDK configuration should be easier"

The point of calling this Developer Experience is in the spirit of keeping our options open. If we said "oh this is just instrumentation convenience" and there's overwhelming feedback that the biggest problem is, like, SDK initialization or testing, then we'd be artificially limiting ourselves.

Now, do I think that those will be the biggest pain points or the first thing being tackled? No, I suspect that the biggest DevEx problems we have right now are around the actual instrumentation API, but I don't want to presume things that aren't backed by data.
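
To make "declarative span creation sugar" concrete, here is a minimal, self-contained Python sketch. It deliberately does not use the OpenTelemetry API: `start_span`, `traced` and `FINISHED_SPANS` are hypothetical stand-ins, assumed only to contrast an explicit context-manager style with a decorator style.

```python
import functools
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; stands in for a real exporter.
FINISHED_SPANS = []


@contextmanager
def start_span(name):
    """Explicit form: the caller wraps the code to be measured."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the span even if the wrapped code raised.
        FINISHED_SPANS.append((name, time.perf_counter() - start))


def traced(name=None):
    """Declarative form: a decorator that opens a span around every call."""
    def decorator(func):
        span_name = name or func.__name__

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with start_span(span_name):
                return func(*args, **kwargs)
        return wrapper
    return decorator


@traced()
def checkout(cart):
    return sum(cart)


@traced("inventory.reserve")
def reserve(item):
    return item


total = checkout([1, 2, 3])
reserved = reserve("sku-1")
span_names = [name for name, _ in FINISHED_SPANS]
```

The decorator keeps instrumentation out of the function body, which is the convenience being discussed; whether such sugar should look the same in every language is exactly the consistency question raised earlier in the thread.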

svrnm commented 2 months ago

So the SIG needs to ask themselves if/how to manage expectations if people assume that all of this is part of what they are going to do? Is this SIG going to look into making OpenTelemetry easier to build, test, start and debug? If the answer to those questions is "Yes" I have no further objections.

If those are what's needed to make the developer experience better, sure? Better APIs isn't just "there should be declarative span creation sugar", it could also be "APIs should be easier to test" or "SDK configuration should be easier"

The point of calling this Developer Experience is in the spirit of keeping our options open. If we said "oh this is just instrumentation convenience" and there's overwhelming feedback that the biggest problem is, like, SDK initialization or testing, then we'd be artificially limiting ourselves.

Now, do I think that those will be the biggest pain points or the first thing being tackled? No, I suspect that the biggest DevEx problems we have right now are around the actual instrumentation API, but I don't want to presume things that aren't backed by data.

That sounds good to me, but that's not what I read from the proposal yet. Since this already has a lot of approvals I don't think that has to be changed, and in the end I will also not insist on calling this SIG differently. I just wanted (and still want) to call out that the name "Developer Experience" has a very, very broad scope, and SIG members need to be prepared to push back on many requests in that domain that they find out of scope, either at that point in time or in general. To give another example that for me is within "DevEx": what if developers think that the biggest problem is that there is no developer-friendly OTLP endpoint? Will something like OTEP 230 be in scope?

austinlparker commented 2 months ago

That sounds good to me, but that's not what I read from the proposal yet. Since this already has a lot of approvals I don't think that has to be changed, and in the end I will also not insist on calling this SIG differently. I just wanted (and still want) to call out that the name "Developer Experience" has a very, very broad scope, and SIG members need to be prepared to push back on many requests in that domain that they find out of scope, either at that point in time or in general. To give another example that for me is within "DevEx": what if developers think that the biggest problem is that there is no developer-friendly OTLP endpoint? Will something like OTEP 230 be in scope?

That wouldn't be in scope because it's quite clearly something that isn't related to the developer experience of OpenTelemetry itself. Out of scope work for the project doesn't magically become in scope because a SIG was formed.

Respectfully, this naming discussion feels like bikeshedding. The initial charter for the SIG in the proposal is clear and concise, and the name 'Developer Experience' clearly distinguishes the goals and purview of the solution space. In the event that this SIG is wildly successful, perhaps it will continue to deliver improvements on Developer Experience in other areas of the field -- there are certainly many of those. However, I continue to see no clear rationale to change the name to something that potentially restricts the solution space and scope of this project before we've even had a chance to do discovery on what problems should be tackled.

svrnm commented 2 months ago

Respectfully, this naming discussion feels like bikeshedding.

I understand that you see it that way and I apologize for getting that deep into this discussion.

I was asked to express my concerns publicly and to unblock this PR, so I did exactly that:

I do not have a strong concern with it, I just want to make sure that "Developer Experience" how I understand it and what the SIG plans to do is congruent.

I got my answer, so from my side there is no further need to discuss this. I approved the PR already with the initial name, so I can only say that my approval remains.

austinlparker commented 2 months ago

Sorry, that was less directed at you and more the generic reader.

tsloughter commented 2 months ago

I don't see any unresolved comments but this is still blocked on 3 reviews.

jpkrohling commented 2 months ago

I'm merging, as I can count 5 GC approvals already and no concerns from the other 4.