igvteam / igv.js

Embeddable genomic visualization component based on the Integrative Genomics Viewer
MIT License
643 stars 229 forks source link

Open to continuous integration and ES6 export? #674

Closed eweitz closed 6 years ago

eweitz commented 6 years ago

Hi IGV team! If you're amenable to it, I'd like to implement a few infrastructural improvements for igv.js: continuous integration and ES6 export. These have helped Ideogram.js, a library I develop that's inspired by your work, and I'd like to contribute similar enhancements to igv.js.

Continuous integration would automatically run your existing test suite when you push a commit. CI helps keep code robust, and raises confidence among developers that a codebase has a test suite and passes it. Travis CI is a free service for this and integrates well with GitHub.

ES6 export would let developers easily embed igv.js as a component using JavaScript frameworks like Angular, React, and Vue. This ES6 / ES2015 syntax for import and export derives from module systems like AMD and CommonJS, and puts their best parts into a formal language standard.

Would you be open to CI and ES6 export? If so, I'll open pull requests to support them. Let me know!

zoldello commented 6 years ago

Well, even if the team is not amenable (will break many others, having higher priority issues, etc); you can fork the code and go in the modern direction you propose. The license of IGV is MIT.

jrobinso commented 6 years ago

We are amenable to pull requests in this direction, but backward compatibility and not breaking dependent websites is a given. Forking is a bad idea as this is an active project and you will soon be out-of-date.

eweitz commented 6 years ago

Thanks Jim, I agree. I'll work in https://github.com/eweitz/igv.js, keep you updated on my progress, and open those PRs when they're ready.

zoldello commented 6 years ago

Jim, what you said makes sense. Eric, are you working on the update solo or are you going to create issues so others can contribute?

eweitz commented 6 years ago

Hi Phil, I'll likely work more or less solo on this for now. I'll open more granular issues as needed if I get stuck or if there's significant related work remaining when I'm done. Thanks for the interest!

jrobinso commented 6 years ago

Modernization will proceed in phases. The first phase, which Eric is taking on, will be to make the igv.js (and igv.min.js) files proper ES6 modules. Modernization of the internal igv.js code will proceed on an incremental and somewhat cautious basis. There is a large user base with many installations and stability is paramount. I'm all on board with the direction however.

romanzenka commented 6 years ago

@eweitz I looked into this a bit. The headless tests do not work (PhantomJS not supporting new javascript syntax), so things like Travis CI are not an option at the moment.

I really like the idea that you do not need node.js to run the unit tests - all you need is a browser. I vote for keeping it "oldschol" - it is easy to cram new and fancy JavaScript features into the project, but with each new gizmo you lose somebody who cannot use said gizmo. Also gizmos go unsupported over time.

I was thinking about using Puppeteer and keeping the exact setup there is now, just making it work with Grunt. That would give you CI support but not break current way of running tests.

As to the packaging, that could be a special build setup that produces a npm ES6 module side by side with the current minimized .js file. I admire igv.js for the sheer simplicity (you just need a browser, no silly Webpack/Babel compilation nonsense). I would recommend building on top of that, but retaining the original simplicity as long as we can.

jrobinso commented 6 years ago

Hi @romanzenka @eweitz I looked into ES6 packaging a little bit. I think the igv.js file could be wrapped as an ES6 module just by adding the appropriate statements to https://github.com/igvteam/igv.js/blob/master/wrapper/header.js. I have not tried it yet, but this is the first solution I would try. If it does not work we can look at creating a separate module.

I agree re the unit tests, they would be hard to automate and are not critical or anywhere near complete. Mainly they test file parsers.

Thank you for the kind words @romanzenka the simplicity is by design, I want it to be easy to use by anyone.

eweitz commented 6 years ago

As to the packaging, that could be a special build setup that produces a npm ES6 module side by side with the current minimized .js file

Yes, that's basically the setup I use to enable Ideogram.js to work via ES6 import and a traditional script tag. See e.g. installation notes, package.json, index.js, and ideogram.min.js.

I think the igv.js file could be wrapped as an ES6 module just by adding the appropriate statements to https://github.com/igvteam/igv.js/blob/master/wrapper/header.js.

I tried that as well, but wasn't able to get it to work. I'll give it another try. If it turns out to be infeasible, I'll pursue the path that involves the least change to igv.js.

I was thinking about using Puppeteer and keeping the exact setup there is now, just making it work with Grunt. That would give you CI support but not break current way of running tests.

Agreed. I actually began that work here. I'll be resuming it this week.

Webpack/Babel compilation

I'm being a bit off-topic here, but I would recommend those tools. They can yield benefits like drastically smaller code size via tree shaking, which improves time-to-interactive. But I respect that this project prefers not to use them, at least not at this time. My goal is to get ES6 export and CI working with Grunt.

jrobinso commented 6 years ago

I would prefer a single file, or rather 2, igv.js and igv.min.js, rather than a separate file for ES6. I don't feel too strongly about this, however I would like to understand why it doesn't work. As a test you could try just hand-editing a built igv.js file and replacing the AMD & Commonjs statements with ES6 ones.

I just pushed an example using AMD loading with require.js, see examples/igv-require.html. You must run grunt to build the dist directory before running this example.

eweitz commented 6 years ago

Thanks Jim, I'll take a look at those files.

romanzenka commented 6 years ago

@jrobinso I see two issues with the NPM resulting from this process:

That said, despite these "purity" concerns, what you did works for me and I am already using this IGV npm as a dependency of my project without issues (I need all of it, I have no conflicts with your extensions).

To modularize IGV would mean using require throughout your entire codebase. I do not know how to do that without breaking backwards compatibility, and I do not know how to do that automatically during build - someone would have to manually specify that e.g. FeatureTrack requires BinaryReader and so on, unless we cleverly extracted these dependencies. It sounds like too much work with very minimal benefit (embedding portions of IGV would allow someone save up to ~1MB of javascript if they decided they need only small subset of IGV features). I do not see people being too gung-ho about using IGV this way though.

jrobinso commented 6 years ago

@romanzenka Agree, I have no intention of trying to minimize igv or radically alter the current process.

There is no dead code in IGV itself, a tree shake could very probably remove a lot of the embedded jquery. My long term plan is to get rid of JQuery in any event.

Thanks for the reminder re global state and polyfills. I had begun removing those but probably got distracted by things like adding requested features and removing bugs. However, I'll have a look. We shouldn't need any polyfills at this point, we are not even attempting to support older browsers.

How are you using an NPM module in a browser?

romanzenka commented 6 years ago

@jrobinso I have a project that uses Webpack, so it simply gets compiled together with my code into a single javascript file.

eweitz commented 6 years ago

I agree with much of Roman's comment. Some refinements and extensions are below. This is merely food for thought, as at this point we're discussing things that will not be done in the near term.

To modularize IGV would mean using require throughout your entire codebase.

If that work is taken on in the future, I'd recommend using import.

I do not know how to do that without breaking backwards compatibility, and I do not know how to do that automatically during build - someone would have to manually specify that e.g. FeatureTrack requires BinaryReader and so on, unless we cleverly extracted these dependencies.

The modularization done for D3 4.0 would be a good reference to consult if such an architecture change is desired in the future. Each module could be a separate repo with its own package.json. The code in the top-level igv.js repo could simply integrate those module repos. See e.g. how the top-level d3 repo has no src directory and merely an index.js and package.json that integrates modules from d3-selection, d3-brush, etc.

It sounds like too much work with very minimal benefit (embedding portions of IGV would allow someone save up to ~1MB of javascript if they decided they need only small subset of IGV features).

It would require substantial work. But I think it would have more than very-minimal benefit. Beyond enabling downstream developers to save parse-time for, let's say, 1 MB of JavaScript on each page load (which is huge!), it would also enable them to easily re-use modules outside of a traditional genome-browser context.

For example, after some minor adjustments, bamUtils.js and bgzf.js strike me as components that could be re-used in a much wider set of bioinformatics web applications. Modularizing such generic code would enable it to be evolved more easily, too.

I do not see people being too gung-ho about using IGV this way though.

That's probably right. The audience for such enhancements would be not be numerically large. It would mostly benefit use cases where initialization performance is important, and developers of bioinformatics web components.

Jim's current focus on "adding requested features and removing bugs" is just right, in my opinion. If more radical changes are considered in the future, hopefully this comment provides helpful context.

jrobinso commented 6 years ago

Tree shaking could be done post-build on the igv.js file, no?

However, with respect to size, ~3/4 of the size is actual igv code, and there is no dead code there that can be shaken out, so I'm not sure where the savings would come from.

Finally, igv.js is designed for single-page applications, and 1 mb is not significant in comparison to the genomic data that is going to be loaded to view. It's not even the size of a single bam index file.

eweitz commented 6 years ago

I think we all agree that any significant work on igv.js payload-size optimizations would be best addressed in a separate issue at a later time. Tree shaking and modularization are independent of each other, and independent of ES6 export and CI.

jrobinso commented 6 years ago

I'm not sure I agree that payload-size matters with igv.js, I would need to know the use case. In a typical IGV session several hundred mb of data will be loaded in short order, cutting 1.2 mb in 1/2 is insignificant. Also, as noted, most of that 1.2 mb is needed.

jrobinso commented 6 years ago

In other news, I have igv.js mostly working as an es6 module. Pull, run grunt, and look at examples/igv-es6.html. I say mostly working for 2 reasons (1) If the module igv.es6.js is in any directory other than examples/js the browser can't find it, with a 404, even though it is clearly using the correct url (2) There are failures due to coding bugs in strict mode. These are real bugs. Mostly undeclared variables. I'll have to run them down one at a time. These need addressed anyway, they are polluting the global namespace.

Nonetheless enough is working to validate the modularization.

I am out tomorrow but will pick up running down the strict mode errors on Thursday.

jrobinso commented 6 years ago

I could not, BTW, figure out a clever way to use the same igv.js file for ES6 modules and everything else. The current igv.js file checks for AMD, then CommonJS, and if those aren't detected declares the global window.igv. It needs to do that for users not using a module system. If there is a way to detect at the time the self-evaluating factory in igv.js is run that it was invoked from an es6 "module" script (type = module) then we could do it there, or declare the global if not. However, for now I'm just going to create a separate file -> igv.es6.js.

BTW I had to remove the minimization from Grunt because it depends on the harmony branch of the uglifier. This can't be loaded as a dependency automatically, it has to be installed. So we minimize as an external step. When the standard NPM uglifier module supports ES6 I will restore the minimification step.

romanzenka commented 6 years ago

@jrobinso I am using a drop-in replacement for the grunt uglifier grunt-contrib-uglify-es and successfully uglifying IGV with it. I use these newer dependencies for my work:

"devDependencies": {
    "grunt": "1.0.3",
    "grunt-cli": "1.2.0",
    "grunt-contrib-concat": "1.0.1",
    "grunt-contrib-uglify-es": "3.3.0"
  }
romanzenka commented 6 years ago

@eweitz Actually I'm guilty as charged. I have written a custom build of igv.js that actually does exactly what you described - extracts the track code, does not keep the browser itself, so I can integrate igv tracks seamlessly with a custom framework from another project. So there IS a way for people to do this sort of trick, question is if this is something you want to "force" upon everyone by going the full D3 and splitting what is a single, simple repository into a pile of sub-repositories.

I think the main question here is - who is actively developing igv.js now, and would they be willing to pay this extra complexity for the benefit it brings.

My vision of igv.js is that it is incredibly democratic project where you as a common scientist without extensive JavaScript ecosystem knowledge can use the tool and contribute to it with minimal hassle. But I am newcomer to this project so that is just how I understand the culture of igv.js as I see it.

jrobinso commented 6 years ago

@romanzenka Thanks for the uglifier tip. I will give it a try.

jrobinso commented 6 years ago

@eweitz Back to the topic at hand. The gunt task now wraps igv.js as an ES6 module in the file "dist/igv.es6.js". I will add minification later. I was experimenting to see what was involved and it just worked. I don't think you could have gotten there quickly as the main problems were bugs in igv itself, mostly missing declarations. See examples/igv-es6.html and examples/js/main-es6.js.

eweitz commented 6 years ago

@jrobinso, your ES6 export in dist/igv.es6.js enables igv.js in to be used with React at (see https://github.com/eweitz/igv.js-react), but only after a modification of the package.json used in the igv npm package.

package.json in igv@2.0.0-rc2 uses:

"main": "dist/igv.min.js"

To enable ES6 import in React and other popular JavaScript frameworks, that would need to be changed to:

"main": "dist/igv.es6.js"

The package.json for igv.js has no known dependents in npm or in GitHub. However, there are likely unknown dependents using require to import igv.min.js after installing the npm package, and enabling developers to use either import or require in such cases would be good.

I think the next step here, for me, is to test how this ES6 export solution works when a package is installed via npm and imported via require.

jrobinso commented 6 years ago

Thanks @eweitz . If there were a way to determine at runtime if igv.js is being loaded as an ES6 module it would be possible to have a single file (igv.js) that works for all systems or no system. This would add a test to header.js, if(es6) do the es6 thing, otherwise define window.igv. Also in footer to add the export statement, or not.

jrobinso commented 6 years ago

Actually I'm not sure its possible to conditionalize the export statement like that. But it might be.

jrobinso commented 6 years ago

@eweitz So I've done some investigation and am convinced we will need separate igv.js files, and hence (I assume) separate NPM packages for ES6 modules. There is no way to make the same file compatible with other (or no) module system & ES6. This is unfortunate but it appears to be the rules. The problem is with the export statement, it can't be present for non-ES6 use, and can't be conditionalized.

I thought I was very close, determining how to export igv was easy enough with the following logic, this is very standard up to the last bit where I test for presence of the global. I tested and it works with ES6

(function (root, factory) {

        if (typeof define === 'function' && define.amd) {
            // AMD. Register as an anonymous module.
            define([], factory);
        } else if (typeof module === 'object' && module.exports) {
            // Node. Does not work with strict CommonJS, but
            // only CommonJS-like environments that support module.exports,
            // like Node.
            module.exports = factory();
        } else if (root) {
            // Export to global (no module system)
            root.igv = factory();
        }
        else {
            // Assumed to be ES6 module
            igv = factory();
        }

    }
    (this, function () {

However if trying to run with other module systems or no module system it fails because of the export statement in the footer

export default igv

I have many users using require, and the standard NPM package, with React, Angular, and many more users using igv.js with no module system. Obviously we can't break there systems for the sake of ES6 modules. So a separate package seems to be the best option. Do you see another way?

Suggestions for naming the new package? igv-es6?

eweitz commented 6 years ago

Thanks for the investigation, Jim. I would actually propose using main: "dist/igv.es6.js" for the standard NPM package, for reasons detailed below.

There is no way to make the same file compatible with other (or no) module system & ES6.

I don't think that would be an issue. Those who use script tags could point to igv.min.js, and those who use ES6 could use the standard NPM package. Those who use other module systems could do so as they do now, which does not seem to use the NPM package.

I have many users using require, and the standard NPM package, with React, Angular, and many more users using igv.js with no module system. Obviously we can't break there systems for the sake of ES6 modules.

Could you link me to a dependent using the igv NPM package?

Of the dozen-or-so apps I've seen embedding igv.js, none declare it as a dependency in their package.json and all pull it in by pointing to a URL at build- or run-time.

And while there are over 128 K hits for "igv.js", there are 0 public repos on GitHub depending on igv.js's package.json. (The same GitHub report shows several dependents for my lesser-known project, so it doesn't seem to have false negatives for public GitHub repo dependents.) NPM also reports no package.json dependents.

I agree that many dependents use require for igv.js, but the available information does not show that anyone uses require and the standard NPM package. I suspect everyone using require for igv.js includes the IGV code by pointing their build system to a custom URL like http://igv.org/web/release/1.0.6/igv-1.0.6.min.js, along the lines seen here, or simply copies igv.js into their local vendor directory, or some such.

So it does not seem that changing the NPM package would break backwards compatibility, because it seems none of the many public apps that use igv.js use its NPM package. Any responsible dependents would pin to a specific version of igv.js in their package.json, so future upstream versions would not break their apps.

It's also worth keeping in mind that the upcoming release already breaks backwards compatibility. So even though I would assert that changing the NPM package would not break backwards compatibility (because it seems noone uses it), now is the time to make any such forward-leaning changes.

All modern tutorials use import over require. This official standard is recommended in React, Angular, and Vue. It is supported even without frameworks, in web browsers. With require basically deprecated in the JavaScript ecosystem, and for other reasons outlined above, I'd recommend supporting ES6 import by default in the standard igv NPM package.

jrobinso commented 6 years ago

@eweitz I know we have users who use NPM to get IGV, and who do not use ES6 modules, because if they did it wouldn't work. These might not show up in a dependency search, but the assertions that noone uses it is wrong for sure. @paul-shannon is one such user, perhaps he could comment. I have no requests for ES6 module support other than this current thread, so I'm hesitant to just break those other module systems without a good reason, particularly when there is a good alternative to support separate NPM packages.

jrobinso commented 6 years ago

This might be a dumb question, but why can't both js files be included in the NPM package? Then you import from the one you need depending on the module system you use.

eweitz commented 6 years ago

why can't both js files be included in the NPM package?

They could, but it seems there is no decent way to easily enable both import and require from the same NPM package. The package.json main property determines what file is used for export, and that value can only be one file. There are similar package.json properties like bin and browser, and somewhat exotic workarounds, but it seems all the alternatives either don't fit igv.js's use case or would be awkward and inadvisable.

I'm hesitant to just break those other module systems

This would not break those non-standard module systems. It might require a small change in dependents' package.json when they want to upgrade to a new version of igv.js, but nothing being proposed here would cause their apps to suddenly stop working.

Assuming import and require aren't compatible in one NPM package, I see two options:

  1. Support import in the default package, and provide a separate igv-require package
  2. Support require in the default package, and provide a separate igv-es6 package

Neither of those options would break anything.

The question here is: how do you want to position the upcoming major release of igv.js?

Option 1, making import the easier approach, would ensure that more new dependents (and more modern existing dependents) could easily include the default package like they do their other dependencies. ES6 import has already replaced require as the preferred way to include dependencies in all major JS frameworks and most modern JS libraries. That trend will only increase during the life of igv.js 2.x.

Option 2, making require the easier approach, would ensure that the few existing dependents using the igv NPM package can continue using the default package as they do now. Although some dependents use igv via NPM and require (thanks for pointing out that example), given embedders I know and my GitHub searches of igv.js usage patterns, I suspect the vast majority include igv.js by other means described in my previous comment. And those few who get igv.js via NPM but do not migrate from the non-standard require syntax would still be supported by a separate igv-require package.

It's your call to make. I think the case for Option 1 is stronger, but either option would work.

jrobinso commented 6 years ago

I agree very few get igv.js via NPM, and they can change their package structure. For sure the default artifacts from the standard build will be igv.js and igv.min.js, es6 qualifiers for the es6 module file. By definition, there are no es6 module users of igv.js yet, unless they have done the wrapping themselves (possible).

I'm going to leave this open for a few days and see if we get comments. I think we understand what needs done, we're now just talking about naming the packages.

jrobinso commented 6 years ago

BTW, since you are more expert than me wrt all things NPM, could you post what you did to make the es6 module work here?

jrobinso commented 6 years ago

BTW @eweitz Why is NPM so central to what you're doing? That package manager has no direct connection with ES6. We have somehow morphed this issue, which is ES6 support, to an NPM packaging request. It feels like it should be a separate git issue.

jrobinso commented 6 years ago

@eweitz Could you comment on this article and how it impacts ES modules in the node environment? Particularly the requirement for an ".mjs" extension. Again probably best to continue this in a new issue but we're here at the moment. https://medium.com/@giltayar/native-es-modules-in-nodejs-status-and-future-directions-part-i-ee5ea3001f71

I can't find anything on this in the NPM docs: https://docs.npmjs.com/getting-started/packages

eweitz commented 6 years ago

what you did to make the es6 module work here?

Sure. To get igv.js working idiomatically with React, I did the following:

  1. Updated package.json: change "main": "dist/igv.min.js" to "main": "dist/igv.es6.js", add "scripts": { "prepublishOnly": "grunt" }.
  2. Declared (an experimental) igv a dependency in package.json for igv.js-react.
  3. Imported igv in a React JS module.

Why is NPM so central to what you're doing?

It is central to ES6 export because the NPM package is what defines the file containing the exported module. More broadly, NPM is integral to this work because the goal is to ease use of igv for the most developers, and I think prioritizing support for the increasingly common and standard ES6 module system before the upcoming major release is the best way to achieve that for igv.js 2.x.

Per your suggestion, I've created a new GitHub issue at #701 to discuss this particular NPM package question.

... ES modules in the node environment? Particularly the requirement for an ".mjs" extension.

That would likely be best supported in a separate package, or via the bin parameter in package.json for the default igv NPM package. It could enable support for a command-line interface to igv.js.

jrobinso commented 6 years ago

@eweitz Thanks. RE NPM / module packaging I assume point 1 is what's required for igv, 2 and 3 were for your react test.

eweitz commented 6 years ago

Yes, that's right.

jrobinso commented 6 years ago

Scratch the mjs extension idea, see the comment here: https://github.com/infernojs/inferno/commit/96bc3af8278b6f46e1ae37fbba67cfcf1294086d

.esm.js seems good (rather than .es6.js).

eweitz commented 6 years ago

Either .esm.js or .es6.js would work! They're helpful developer hints. Good call on scratching .mjs.

romanzenka commented 6 years ago

I used the NPM igv package briefly while working on my React app, which is not public.

https://www.npmjs.com/package/igv

I would however have no problem switching to whatever newer system there is.

I eventually abandoned using that NPM and made my own personalized build of IGV that omits some parts my app does not need just to keep the amount of javascript down. I reduced the size of IGV dependency to about 50%. It is extra work and I need to keep updating my build as IGV evolves, but it made sense to me as the savings were substantial in my particular usecase.

eweitz commented 6 years ago

ES6 export is now available per #702.

I hope to have time later this quarter to implement CI, which I started over at https://github.com/eweitz/igv.js/tree/continuous-integration.

eweitz commented 6 years ago

I'll go ahead and close this, as igv.js now has continuous integration and ES6 export!