swyxio / swyxdotio

This is the repo for swyx's blog - Blog content is created in github issues, then posted on swyx.io as blog pages! Comment/watch to follow along my blog within GitHub
https://swyx.io

new dxtips post (not published) #435

Closed swyxio closed 1 year ago

swyxio commented 2 years ago

One of the most immediate problems I encountered when I started working at and investing in devtool startups was my lack of a good mental model for assessing them.

I think I have finally found one, and I call it The 4.5 Kinds of Devtool Platforms:

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1656996075804/7aIgjZ2E-.png align="center")

Context

If you listen to their marketing, every startup is destined to destroy the legacy way of doing things. Every startup has no competent competitors. Every startup is in the top right of their 2x2. Every startup is going to grow to the moon.

But:

Whether you are thinking about joining a startup or investing in one, the problem is the same - having to turn from a positive-sum "Everyone can win! You're doing great honey!" mindset to a zero-sum mindset. Not because you actually believe there can only be one winner, but because there is only a finite amount of your time and money to invest.

One way to decide is by going top-down instead of bottom-up. As @pmarca often notes - "When a great team meets a lousy market, market wins. When a lousy team meets a great market, market wins." Follow the money, and you can start to impute some hard market sizing numbers on the fuzzy language.

Yet the available models have not really clicked with me. Greylock's Jerry Chen notes 32 distinct investment categories in Castles in the Cloud. Dell's Tyler Jewell made a Developer-Led Landscape chart with 4 groupings of 23 subcategories, and yet no mention of security or networking. (These two are the biggest I can remember; please let me know if I have missed any well-known devtool landscapes/breakdowns.) The number of categories doesn't bother me so much as the fact that they are retroactive backfits. From my time as a hedge fund analyst I know how uncomfortable I felt without a good mental model of my coverage universe, and this is no different.

In our attempt to follow the money, we risk forgetting that money follows tech, not the other way around. When we are investing, we get paid for being nonconsensus and right, and that means identifying new or growing or shifting markets from first principles can have incredible rewards.

The ideal mental model of devtools should be:

After a few years of thinking and searching, I think I have found a top-down model that works. It isn't complete by any means (it lacks a full TAM analysis for each segment), but I think there is enough that I feel I have gained useful insight to share.

The SDLC Approach

One way to sort devtools in a top-down manner is to ask yourself which part of the Software Development Life Cycle (SDLC) the tool corresponds to. Beyang of Sourcegraph breaks it down as follows:

  1. Plan and describe what the software should do
  2. Read and understand the code being modified
  3. Write, run, and debug the new code
  4. Test the code
  5. Review the code
  6. Deploy the code
  7. Monitor the code in production and react to incidents

Aside: Some people take issue with the linear SDLC model, so Emily Freeman at AWS has been promoting an alternative model with 6 circles and 6 axes. Full intro video here, but it is too early to tell if this framing will stick.

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1656990200233/AP7em9Ypw.png align="left")

This is sufficiently general that it is language-agnostic, but detailed enough that you can probably pick your favorite ecosystem (or whatever you use at work) and map specific names of tools and companies to one or more of these stages.

If your company is lucky enough to have an internal developer productivity/infrastructure/experience team, you might even see internal tools that you use to meet those needs. Netflix has a nice simplification of the SDLC into an easier-to-remember three tiers (citation, source):

Another well-loved decomposition of the SDLC is the Accelerate/DORA metrics:

While the DORA group was focused on DevOps practices, it's possible to repurpose these ideas to try to understand what process or tooling improvements you can make for the metrics that are underperforming.
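To make "metrics that are underperforming" concrete, here is a minimal sketch that computes the four DORA key metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service) from a list of deployment records. The `Deployment` record and its field names are hypothetical, invented for illustration; real data would come from whatever your CI/CD and incident tooling exports.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional, Sequence


@dataclass
class Deployment:
    # Hypothetical record shape, not taken from any real tool.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def dora_metrics(deployments: Sequence[Deployment], window_days: float) -> dict:
    """Summarize the four DORA key metrics over an observation window."""
    n = len(deployments)
    failures = [d for d in deployments if d.failed]
    lead_times = [_hours(d.committed_at, d.deployed_at) for d in deployments]
    restore_times = [
        _hours(d.deployed_at, d.restored_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": n / window_days,
        "median_lead_time_hours": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failures) / n if n else None,
        "median_time_to_restore_hours": median(restore_times) if restore_times else None,
    }
```

Whichever of the four numbers looks worst relative to your peers is a reasonable place to start asking which process or tool would move it.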

Regardless of what SDLC model you adopt, once you understand the gaps in tooling, you can understand the potential of whatever startup you are considering. You can try to quantify time spent on each in terms of engineer-hours spent, assume some % that could be saved with the tool, and somewhat justify what % of that value created will be captured by the company selling the tool and therefore what it could be worth.
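The back-of-envelope estimate described above can be sketched as a chain of multiplications. This is only an illustration: the function name, parameter names, and every input number below are assumptions made up for the example, not data from any real company.

```python
def estimated_annual_revenue(
    engineers: int,
    hours_per_engineer_per_year: float,
    share_of_time_on_stage: float,   # fraction of time spent on this SDLC stage
    fraction_saved: float,           # fraction of that time the tool could save
    loaded_cost_per_hour: float,     # fully loaded cost of one engineer-hour
    capture_rate: float,             # fraction of created value the vendor captures
) -> float:
    """Rough per-customer revenue estimate for a tool targeting one SDLC stage."""
    hours_on_stage = engineers * hours_per_engineer_per_year * share_of_time_on_stage
    value_created = hours_on_stage * fraction_saved * loaded_cost_per_hour
    return value_created * capture_rate


# Made-up inputs: 100 engineers, 2,000 hours a year each, 10% of time on the
# stage, 25% of that saved, $100 per engineer-hour, 10% captured by the vendor.
per_customer = estimated_annual_revenue(100, 2000, 0.10, 0.25, 100.0, 0.10)
print(f"${per_customer:,.0f} per customer per year")
```

Multiply the per-customer figure by a plausible number of customers and you have a crude ceiling on what the company could be worth, which is all this kind of estimate is good for.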

The SDLC approach was the primary way I understood developer tools until I noticed the sheer number of developer-oriented startups that did not fit neatly into this model.

Most noticeably, Infra startups.

Is Infrastructure a "DevTool"?

My first disillusionment with the SDLC framing of developer tools came from talking with enough open source software startups and noticing a common pattern:

Ultimately, the First Principle of Technology is that everything can be broken down into some combination of compute, storage, and networking, and that is what most devtool startups end up charging for anyway:

So shouldn't you really be breaking down tools based on the source of money they are making instead of the random fuzzy unmeasurable multicausal SDLC story they happen to be hawking?

The best perspective on this that I have found comes from David McJannet, CEO of Hashicorp (podcast, chart source):

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1656985931619/QAk5vJ3_b.png align="left")

Here are the "tiers" of chargeable infrastructure I have gathered so far:

  • Note that Networking probably splits into internal networking (between services, inside a VPC) and external networking (customer-facing or egress-type charges). This makes it rather annoying to visualize neatly in a stack diagram, so I have simply not bothered with one.

Notice that this "infra-centric" view of the world omits productivity/collaboration tools like GitHub or JetBrains or Atlassian/Jira, so we have clearly lost generality from the SDLC view of the world. Arguably we are playing fast and loose with the definition of "devtools" already - most other authors I cite are clearly thinking of application developers, whereas I am adopting a more expansive view of how much infra is owned and managed by (or at least designed for) developers. But aren't "Infra as Code" and "Serverless" and other modern movements successfully blurring the distinction?

Still, I like this model better than the SDLC one because it probably leads to a better understanding of how money is made: it is arguably true that the median successful Devinfra startup is more valuable than the median successful Devproductivity startup (citation needed!). From a certain point of view, Devinfra is the "B2B" version of devtools while Devproductivity is "B2C".

We can even try to integrate the SDLC and the infra-centric views of the world with some support for Low/No Code end-user application development. In the Coding Career Handbook I sketched out a "stack" of these layers, inspired by the OSI model:

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1656992165416/t1l8EPZGV.png align="left")

This seems like a vaguely workable decomposition of the world. And yet, it is still inadequate.

We've underestimated the role of data.

Money follows Data

From a first-principles point of view, if your startup went from running to not running - had data flowing in and out and then suddenly paused - your compute and networking costs would go to 0, but you'd still be charged for data at rest. While storage is a commodity and it trends ever cheaper, there's a certain gravity to data that ties money to wherever it goes.
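The "paused startup" thought experiment can be sketched as a toy billing model. Everything here is hypothetical: the `Usage` and `Prices` shapes and the unit prices are invented for illustration and do not reflect any real cloud provider's pricing.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    compute_hours: float   # hours of compute consumed this month
    egress_gb: float       # data transferred out this month
    stored_gb: float       # data sitting at rest


@dataclass
class Prices:
    # Hypothetical unit prices, chosen only to make the arithmetic visible.
    per_compute_hour: float = 0.10
    per_egress_gb: float = 0.09
    per_stored_gb_month: float = 0.02


def monthly_bill(usage: Usage, prices: Prices) -> dict:
    """Split a monthly bill into compute, networking, and storage."""
    return {
        "compute": usage.compute_hours * prices.per_compute_hour,
        "networking": usage.egress_gb * prices.per_egress_gb,
        "storage": usage.stored_gb * prices.per_stored_gb_month,
    }


running = monthly_bill(Usage(compute_hours=720, egress_gb=500, stored_gb=1000), Prices())
paused = monthly_bill(Usage(compute_hours=0, egress_gb=0, stored_gb=1000), Prices())
print("running:", running)
print("paused: ", paused)
```

In the paused case the compute and networking lines drop to zero while the storage line is unchanged, which is the whole point: data at rest keeps billing even when nothing else is happening.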

The degree to which I believe that "money follows data" cannot be overstated. In my first meeting with Sarah Catanzaro, I remember blurting out my secret belief that "almost all successful companies are really data companies" (if you also include database and/or data model). I even jokingly expanded it into Zawinswyx's Law:

Every startup attempts to expand until it runs a custom database. Those startups which cannot so expand are replaced by ones which can.

Some examples:

Trivially, this is not much bolder of an observation than the fact that every successful startup vies to be a system of record for something (and the fact that the average startup now has ~100 SaaS services all trying to be data silos for your data has led me to be interested in long tail ELT - #plug). But of course, instead of building your own database, you could buy it...

We're getting a little sidetracked. There are two loosely related halves of note in the data world:

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1656989543321/y4QeQTpFf.png align="left")

Throw a rock and you'll hit a current or future billionaire in the data world. The opportunity in data alone is demonstrably much bigger than the proportionately tiny space given to it in the Hashicorp diagram, because the Infra view of the world gives equal emphasis to each infra component.

But they are not made equal. Data is special. Give it the attention it deserves.

There's the gatekeeping question again - is "data tooling" really "devtools"? A business analyst building dashboards or slinging SQL, or a data scientist running TensorFlow or PyTorch, wouldn't really consider themselves developers as their primary identity. But more and more of dev workflow and tooling is eating this world too.

Paying for the Periphery

There are further imperfections to the neatly divided Infra-centric Hashicorp/McJannet model. At least two disciplines cut across all segments: monitoring and security.

Notice that the APM function is drawn as cutting across all parts of the application and infra platform. It is mostly owned by Datadog (32b), but Dynatrace (12b), New Relic (4b), and Sentry (3b) are in the mix. If you broaden monitoring to include observability, then you may as well throw in Honeycomb and Lightstep as important players in the space.

If you talk to any security folks, they'll also seem wonderfully cross-cutting. Jack Naglieri of Panther breaks a typical security setup into:

So essentially, there is a matching security and monitoring role for every part of the devtools stack that we have already identified, with a little extra for maintenance/response.

You might imagine other centralized services belonging in this bucket. One of the reasons it was surprisingly hard to pitch Temporal was that, while workflow engines are often viewed as operational/hot path tools (and the company stood to make more money that way), a lot of the initial use cases were more infrequent, ranging from database migrations to infrastructure provisioning. Still, the central workflow engine team (like the one at Stripe) would run all of these use cases.

The 4.5 Kinds of Platforms

With thanks to all the previous inspirations, we can put all our insights together into one model:

![image.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1656996075804/7aIgjZ2E-.png align="center")

I haven't yet done the work to put numbers to all these categories, but intuitively it feels more right than anything else I've come across in fitting all the startups I see.