statgen / locuszoom

A Javascript/d3 embeddable plugin for interactively visualizing statistical genetic data from customizable sources.
https://statgen.github.io/locuszoom/
MIT License
154 stars 29 forks source link

TypeScript types #274

Closed sir4ur0n closed 2 years ago

sir4ur0n commented 2 years ago

So this is more an open discussion issue than an actual issue/feature request for now :sweat_smile: I want to get the ball rolling, and see where you stand on this.

I have migrated my code base to TypeScript in the last few days, and it makes for a smoother dev experience in my opinion, in particular when libraries provide types.

What do you think about providing LocusZoom TypeScript types? Apparently many libraries publish a @types/xxx NPM library, e.g. @types/locuszoom.

Here is a couple of concrete examples where it would be valuable, based on my (very limited, I admit) use of LZ so far:

interface SetStateOptionsConfigField { display_name: string, value: any // or we could even make this interface generic on the type of value }


* one wants to subclass `AssociationLZ` but contact a different API which responds in a different format. One thus needs to override `_normalizeResponse` to adapt back the response to the format of UMich. With a return type in TypeScript, this would make things much simpler!
```ts
// In LocusZoom library
interface UMichAssocRow {
  analysis: string,
  ref_allele_freq?: number,
  // etc., maybe also add an index type to accept "any extra field"
}

class AssociationLZ {
  // ...
  _normalizeResponse(/* ... */): Array<UMichAssocRow> {
    // ...
  }
}

// In developer code
class CustomAssociationLZ extends AssociationLZ {
  _getUrl(/* ... */): string {
    return "https://my.custom/api";
  }

  _normalizeResponse(/* ... */): Array<UMichAssocRow> {
    // Here it's much simpler to implement, because TypeScript typechecker gives me errors and hints about the structure I need to return
  }
}

I reckon a major drawback is that since there is a significant API surface for LZ, it would take some work to provide the types for all its API. This is the major "problem" in my opinion.

Another con one could argue is that parts of LZ are intrinsically dynamic (e.g. the layout subscription mechanism: it's not really possible to know at compilation time the type of LocusZoom.Layouts.get("foo", "bar")). I don't think this is a real issue: TypeScript typechecker has mechanisms to elegantly step back in those situations, e.g. with any, type casting or index types. 99% of the code would benefit from types, and maybe 1% would not, but would not be blocked either. I think it's ok.

Maybe this could be mitigated by slowly migrating LocusZoom codebase itself to TypeScript? Once LZ is migrated to TS, then it should be a easy - I think - to have/generate its API types and publish to NPM.

abought commented 2 years ago

Thanks for the inquiry.

Broadly speaking, TypeScript does seem neat- but as you mention, not likely on any current timeframe.

Technical barrier: adoption of pre-requisite tools

It's really hard to know how many developers are customizing LZ beyond the default options- like yourself, some of the most custom sites are private internal tools, and so we have to guess a bit about needs.

Investing in TS requires a) that devs doing custom things exist, and b) that they are using fancy tooling to get the benefits of typescript. A lot of the users we hear from are scientists who've never written JS before, and are confused by the extra level of syntax. When I last ran metrics ~2yr ago, > 60% of LZ plots came from PheWebs, which don't use NPM, or a build system- much less typescript. But it's really hard to be sure.

If you notice me asking a lot of users about an example gallery, or to see how they use LZ- that's why! Big investments are easier to justify if we know the audience is there.

Path forward: validation improvements Recent releases (like 0.14) have introduced more error messages to help people know when a layout is asking for data not provided by the adapter. Early versions of LZ tried to be too accommodating of dynamic data, at the expense of clarity about what data was needed. It's a real concern we want to address, and I try to jump on any tasks that make integration easier for people!

We've also (slightly abused) JSDoc to capture what the required config fields for a widget should be (example: set_state). In theory, these annotations could be used as a basis for additional automatic validation in the future.

One possible intermediate road would be something like JSONSchema validation, which puts the focus more on validating data objects (like LocusZoom "layout" config) rather than code. This would work in any environment- even for the large audience of sites like PheWeb where people didn't use TypeScript. But it could be a bit fiddly to get right, because LZ allows "plugins" / "extensions" that change the schema for what's allowed. As we see more customization, we'll try to target the improvements most valuable for what people in a small community want to do. Feedback really does impact the direction we take here!

sir4ur0n commented 2 years ago

I'm quite late to reply to this issue, sorry!

Your answers to "Providing types would probably have low benefits to users unless we have evidence of many LZ users using TS" make sense indeed :+1:

However what are your thoughts on gradually migrating LZ code base to TS? Then LZ maintainers (you mostly, I think? :sweat_smile: ) would reap the benefits (catch bugs earlier; easier refactorings and code changes, like the one in https://github.com/statgen/locuszoom/pull/240) over time.

And the nice feature about TS is that it supports gradual typing, so there is no need for a big bang change. Enable TS in the build, rename .js -> .ts, and the migration is done. Now types can be slowly, gradually added to functions, classes, types, as opportunities present themselves (e.g. work on https://github.com/statgen/locuszoom/pull/240 could include a first lightweight commit to add types on all code that seems related to LD; and then do the refactoring).

My understanding is that it would be transparent for JS users, they would still import the JS code (via NPM, Yarn, or direct download). However TS users would now benefit from all types for free.

Also note that more and more JS LSPs support reading the TS type files even when editing JS files. So actually, even JS users would get more types and help from their tools (assuming they use some tool)

If I summarize (hopefully I'm not forgetting some aspects):

I hope I don't come off as too pushy :sweat_smile: It's perfectly ok to say "No" without justifying yourself for this decision, as you are the maintainer. But I wanted to make sure my thoughts and suggestions were clear (it's not just "More work for you, more benefits for (some) users"; it's also more benefits for you).

Whatever conclusion is reached, thanks for taking the time to answer my issues and questions!

abought commented 2 years ago

Thanks for the enthusiasm!

My general rule is that instead of saying "no", I should help people find the conditions under which we could say "yes".

Right now, our dev community is really small, and most of them are end-users (rather than core maintainers). I'd like to add typescript when we know there will be a tangible benefit to other devs- if we start seeing a shift of more contributors to core, then you will find TS types moving naturally up the list of priorities.

It's definitely a hard balance- the first time I introduced any type annotations, or even ES6, my bioinformatician colleagues took several days to admit that they couldn't read the new syntax! (in trying to make the code more readable, I inadvertently broke their ability to read code)

I've been getting them more into the front end world (via things like Vue.js), which is making it easier for them to adopt other tooling incrementally.