Closed jfsiii closed 6 years ago
I would love to move to d3 v4, but I know it would be a big task. I will have a quick go at upgrading on my branch to gauge how hard it would be. I would also love dc to take some of the best practices that d3 now uses:
D3 4.0 is modular. Instead of one library, D3 is now many small libraries that are designed to work together. You can pick and choose which parts to use as you see fit. Each library is maintained in its own repository, allowing decentralized ownership and independent release cycles.
You can also cat D3 microlibraries into a custom bundle, or use tools such as Webpack and Rollup to create optimized bundles. Custom bundles are great for applications that use a subset of D3’s features; for example, a React chart library might use D3 for scales and shapes, and React to manipulate the DOM. The D3 microlibraries are written as ES6 modules, and Rollup lets you pick at the symbol level to produce smaller bundles.
The default UMD bundle is now anonymous. No d3 global is exported if AMD or CommonJS is detected. In a vanilla environment, the D3 microlibraries share the d3 global, even if you load them independently; thus, code you write is the same whether or not you use the default bundle. (See Let’s Make a (D3) Plugin for more.)
In short: modular, extendable and anonymous.
Nods. All good things.
Actually I don't think this is the most difficult task ahead for d3 - some of the design and debugging problems are really fiendish.
And I think it's a pretty safe change.
However, I do want to watch the adoption of v4 because it is likely there will be some gotchas and best practices to be learned as it rolls out.
(Does modularity require splitting dc.js into a dozen or more repos? Because that will definitely increase the cost of maintenance.)
Presumably the modularity just translates to "simply include the parts of d3 that dc.js requires," without any sort of funkiness on dc's side.
I'm not sure if there's any part of dc.js itself that would really make sense to split off into a separate bit -- anonymous-ness, perhaps, though.
I think modular charts would be great. If I only want to have a bar and pie chart then I only need to include those, not the whole library. Though at 82.6 KB minified it's not that bad, going forward adding more charts to dc will make it bigger.
I tested d3 v4 on my branch and managed to fairly easily get something to work. Most things are just renames (e.g. d3.time.format to d3.timeFormat).
However, it seems that d3.layout.stack (now d3.stack) has changed quite a bit, so there is no data rendering. Though as @gordonwoodhull says, there will be more gotchas during the implementation.
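To make the stack change concrete, here is a rough sketch of the data-shape difference. In v3, d3.layout.stack took an array of layers and wrote a y0 baseline onto each point; in v4, d3.stack takes an array of row objects plus a list of keys and returns one series per key made of [y0, y1] pairs. The helpers layersToRows and stackRows below are hypothetical and hand-rolled for illustration (stackRows only mimics the output shape, it is not d3's implementation), and the sample data is made up.

```javascript
// Hypothetical helper (not part of d3 or dc.js): turn v3-style layers
// [{name, values: [{x, y}, ...]}, ...] into one row object per x value,
// which is the tabular input shape that d3.stack().keys(...) works on in v4.
function layersToRows(layers) {
  const rowsByX = new Map();
  for (const layer of layers) {
    for (const point of layer.values) {
      if (!rowsByX.has(point.x)) rowsByX.set(point.x, { x: point.x });
      rowsByX.get(point.x)[layer.name] = point.y;
    }
  }
  return Array.from(rowsByX.values());
}

// Hand-rolled stand-in that mimics the v4 output shape: one series per key,
// one [y0, y1] pair per row, the source row on `.data`, the key on `.key`.
function stackRows(rows, keys) {
  const baseline = rows.map(() => 0);
  return keys.map(key => {
    const series = rows.map((row, i) => {
      const y0 = baseline[i];
      const y1 = y0 + (row[key] || 0);
      baseline[i] = y1; // the next key stacks on top of this one
      const pair = [y0, y1];
      pair.data = row;
      return pair;
    });
    series.key = key;
    return series;
  });
}

// Made-up sample data in the v3 layer shape.
const layers = [
  { name: 'a', values: [{ x: 1, y: 2 }, { x: 2, y: 3 }] },
  { name: 'b', values: [{ x: 1, y: 5 }, { x: 2, y: 1 }] }
];
const rows = layersToRows(layers);
const stacked = stackRows(rows, ['a', 'b']);
```

Code that used to read d.y0 and d.y from each point would instead read d[0] and d[1], which is probably why nothing renders after a plain rename.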
The idea that plugins are the same as built-ins is really good, too.
I mean you can already plug in most chart PRs safely, but it's awkward and requires more git-fu than most people need. I think if we adopt the same architecture then we can just have a gallery of contributions and when they get to a certain level (usually involving tests, unfortunately) they can migrate to the main repo or organization. It should help with getting more feedback as the charts evolve.
I think there are breaking changes in every d3 component, even d3.selection is a little different IIUC.
That's regular migration stuff.
I'm more worried that we understand the right way to include modules and watch as best practices for modules and plugins emerge. This, for example --
https://github.com/d3/d3/issues/2793#issuecomment-229074764
-- is still Greek to me, but I'm sure in a couple of months it will make perfect sense!
Huh, modularity would indeed be useful for third-party charts & PRs & whatnot. Hadn't thought of that.
Dunno if I have anything else to add, but there you go.
So what is the plan then? Use @jfsiii's suggestion of finishing off dc v2 using d3 v3 and then create dc v3 using d3 v4? Would upgrading to d3 v4 warrant an almost complete rewrite?
Something like that. I've been putting breaking changes into 2.1 (semantic version without quite so much inflation), but I don't really care if we call it 2.2 or 3.
I doubt it's a rewrite as there are a lot of subtle details to the way the code is now. Of course if you want to write dc.js from scratch, no one's stopping you, but I'd rather see this repo evolve... Many chart libraries have died from trying to change everything at once.
I'd also rather port to the v4 interface before modularizing. If we do break dc.js up into lots of repos, it will become a lot more difficult to port fixes, and I think there is value in fixing the hundred or so issues identified as interface-compatible with 2.0.
Perhaps I am too conservative. Unfortunately this is not a full-time gig for me.
I was playing around with Angular 2 the other day and implementing the material design components - https://github.com/angular/material2. The whole thing is modular but it is all still housed in the one repository. Then each area has its own guide (e.g. https://github.com/angular/material2/tree/master/src/components/input). I found this much easier to use as all the code is still in one place, but it is organised and it even means the documentation is modular (unlike the dc documentation which is bigger than Ben-Hur).
Cool! I will check it out.
BTW, thanks to @mtraynham, we now have HTML documentation which you may find more navigable. We should advertise it better!
Suggestions welcome.
I've put together this table which should help identify all uses of d3 in dc.js and references to the d3 4.0 counterpart.
Note: This table does not currently outline any inconsistencies and breaking changes, but I will add them if they are identified.
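For the straightforward cases, the mapping can be expressed as a simple lookup. This is only a sketch: the helper name v4Name is hypothetical, and the entries are a handful of renames listed in the d3 4.0 change log, not the full set of d3 calls that dc.js makes. Many v3 APIs (brushes and axes, for example) do not have a one-to-one counterpart and are deliberately left out.

```javascript
// A few v3 -> v4 renames from the d3 4.0 change log. Not exhaustive.
const V3_TO_V4 = {
  'd3.time.format': 'd3.timeFormat',
  'd3.time.scale': 'd3.scaleTime',
  'd3.scale.linear': 'd3.scaleLinear',
  'd3.scale.ordinal': 'd3.scaleOrdinal',
  'd3.layout.stack': 'd3.stack',
  'd3.layout.pie': 'd3.pie',
  'd3.svg.arc': 'd3.arc',
  'd3.svg.line': 'd3.line'
};

// Hypothetical helper: return the v4 name for a v3 name, or null when
// there is no simple rename and the call site needs a manual look.
function v4Name(v3Name) {
  return Object.prototype.hasOwnProperty.call(V3_TO_V4, v3Name)
    ? V3_TO_V4[v3Name]
    : null;
}
```

A rename only tells you where the function lives now; the arguments and return values may still have changed, as with d3.stack.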
Here is the change log, for anyone who is interested. There are minor breaking changes to every single module, it seems:
https://github.com/d3/d3/blob/master/CHANGES.md
Warning: it's a long read!
This seems to be the best overview and rationale of the major changes: https://medium.com/@mbostock/what-makes-software-good-943557f8a488#.m7uanuhgo
You can skip the philosophizing at the beginning and go straight to the cases, which are the real point.
The biggest change is case 1: enter().append() no longer modifies the update selection. It will take some staring at our code to determine where we rely on this.
And that's the point: it wasn't explicit before, it was hidden. Our code will become clearer by explicitly merging selections. But we have to read it carefully to figure out where we need to.
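Roughly, the pattern looks like the sketch below. It assumes g is a d3 v4 selection; the function name renderBars, the rect.bar selector and the value field are made up for illustration and are not taken from dc.js.

```javascript
// d3 v3 style, relying on enter().append() adding nodes to the update selection:
//   var bars = g.selectAll('rect.bar').data(data);
//   bars.enter().append('rect').attr('class', 'bar');
//   bars.attr('height', ...);   // in v3 this also reached the new nodes
//
// d3 v4 style: merge the entered nodes into the update selection explicitly.
function renderBars(g, data) {
  const bars = g.selectAll('rect.bar').data(data);

  // Newly created nodes only.
  const entered = bars.enter().append('rect').attr('class', 'bar');

  // Entered + pre-existing nodes; shared attributes are set once, here.
  const all = entered.merge(bars);
  all.attr('height', d => d.value);

  // Nodes whose data went away.
  bars.exit().remove();
  return all;
}
```

Anywhere the current code sets attributes on the update selection after an enter().append() is a candidate for this kind of explicit merge.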
Looks like brushes might be one of the biggest changes that affects dc.js; some hints in this comment: https://github.com/d3/d3-brush/issues/14#issuecomment-250993767
It's been a couple of months since this was first brought up 😄. What are your thoughts about updating to use v4 of d3, @gordonwoodhull?
I'd love to do it, however it's a ton of work. I'm not concerned about the component renames, but the APIs and semantics of pretty much every component have changed, as detailed in the CHANGES doc I linked above.
If anyone wants to start working on this in a fork, that would be excellent.
In terms of versioning, I figure I want to first pull in all the big pull requests into dc.js 2.1, and then port them to d3v4 in dc.js 3.0.
I have a bunch of branches where I fixed the worst problems in the 2.0 betas, and in most cases I discovered that I couldn't fix them without changing the API, colors, or behavior in incompatible ways. So I'm inclined to just go ahead and release 2.0 pretty much as it is, and merge those fixes for 2.1.
Shameless plug here - but started a new approach (based on Polymer, d3.v4 and Universe) available here for very early(!) preview.
The idea is to use the modular nature of Polymer to construct dc-like charts with web-components (i.e. handle building blocks directly in the markup - and importing only relevant components).
There are two independent libraries: multi-verse for handling filtering/grouping of dataset, and multi-chart for rendering charts and handling selection (brush or click behaviors).
One simple example looks like:
<!-- Load the data -->
<multi-csv url="flight.csv" data="{{data}}"></multi-csv>
<!-- Start a multi-verse (similar to creating a new crossfilter) -->
<multi-verse id="universe" data="[[data]]" universe="{{universe}}">
  <!-- Group the data by distances (count only) -->
  <multi-group universe="[[universe]]" data="{{data-chart-distance}}" group-by="distances">
    <!-- Render this group in a bar chart -->
    <multi-verse-bar title="distance" data="[[data-chart-distance]]"> </multi-verse-bar>
  </multi-group>
  <!-- Group the data by day -->
  <multi-group universe="[[universe]]" data="{{data-chart-day}}" group-by="day">
    <!-- Render this group in a pie chart -->
    <multi-verse-pie title="day (pie)" data="[[data-chart-day]]" color-scale="{{colorScale}}" width="{{width}}">
      <!-- Add a color scale legend to the chart -->
      <multi-legend legend chart-width="[[width]]" scale="[[colorScale]]" position="top-right"></multi-legend>
    </multi-verse-pie>
  </multi-group>
</multi-verse>
and a more mature example:
Agreed, moving to v4 is not an easy task. I do not know how to migrate the a.functor variable.
Hi @jakobzhao, d3.functor isn't all that complicated and it's not hard to replace.
https://github.com/d3/d3/blob/master/CHANGES.md#internals
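For reference, a local replacement could be as small as the sketch below. It mirrors what d3.functor did in v3 (pass functions through, wrap anything else in a function that returns it); the standalone name functor is just a choice for this example.

```javascript
// Local stand-in for the removed d3.functor: if v is already a function,
// return it unchanged; otherwise return a function that always yields v.
function functor(v) {
  return typeof v === 'function' ? v : function () { return v; };
}

// Usage: accessors can then be called uniformly, whether the caller
// supplied a constant or a function.
const constantRadius = functor(5);
const computedRadius = functor(function (d) { return d.r * 2; });
```

Here constantRadius() gives 5 and computedRadius({r: 3}) gives 6.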
It will be a big task to convert dc.js to d3v4 but not that hard I think. Just will take a lot of time to sort out all the subtly changed functionality. I hope to get to it this year. If anyone makes partial progress on this please lmk.
Any update on progress supporting d3v4? Thanks.
I'd really like to do this, and I've been learning all I can about the differences - just need to find 2-3 weeks of solid time to actually do the work. :-/
Don't stress too much about it Gordon. With open source projects like this that have only a few maintainers, it's hard to adopt stuff fast. If I had more time away from my actual job I'd be willing to help, but I won't promise stuff that I can't keep.
agree, don't stress too much!
Is there any branch where anyone has taken any initial steps at implementing this? Just curious if there is some place I can muck around and potentially contribute. It would really be nice to switch to d3v4, but I understand how large a change that is for dc.js and it is pretty daunting to tackle it when you don't quite understand the inner workings of a good chunk of the library :-)
Please check #1363 for current progress.
I have started work on this task. My current understanding:
My overall direction of attack is as follows:
Well I took a plunge (#1363), following is current status:
Updated version in package.json to use d3 v4.
Created 3 files (in src/dc-compat):
Limited tests to just pie charts
Only one change was needed in spec file:
expect(d3.select(chart.selectAll('g.pie-slice path')[0][0]).attr('fill')).toMatch(/#3182bd/i);
to
expect(d3.select(chart.selectAll('g.pie-slice path').nodes()[0]).attr('fill')).toMatch(/#3182bd/i);
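The reason for that change is that a v3 selection is a nested array of groups, so [0][0] reaches the first node, while a v4 selection is an object that exposes its nodes through nodes(). If specs need to run against both during the port, a small helper along these lines might do; selectionNodes is a hypothetical name, and the v3 branch assumes a single group, which is the usual case for a selectAll on one parent.

```javascript
// Hypothetical test helper: return the DOM nodes of a selection as a flat
// array, whether the selection comes from d3 v3 or d3 v4.
function selectionNodes(selection) {
  if (typeof selection.nodes === 'function') {
    return selection.nodes(); // d3 v4: selections expose nodes()
  }
  return selection[0]; // d3 v3: first group of the nested array
}
```

With it, the assertion above could read d3.select(selectionNodes(chart.selectAll('g.pie-slice path'))[0]) under either version.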
As of now all test cases for pie chart are passing.
My next course of action:
We are really close to a release. The final thing that needs to be ported is d3.layout.stack -> d3.stack, which has a completely different API.
I encourage people to try the 3.0 branch - once this goes beta we'll start publishing to NPM.
All porting to d3 v4/5 is complete and merged to develop/master.
Published 3.0.0-beta.1 to npm.
Only remaining known issue caused by the port is that the range-series example crashes #1424. We need to deal with this in a comprehensive way which will change some assumptions (#1408)
Working on a porting guide here: https://github.com/dc-js/dc.js/wiki/Changes-in-dc.js-version-3.0/_edit
Meanwhile, most of the changes are in Changelog.md.
D3 is being broken up into smaller modules for v4 (e.g. d3-scale, d3-selection, etc.) and there are some API changes.
Discuss version support & approach. Should dc.js support d3 v3 and v4 or only one of them? One option to support both is a jQuery-style approach where 2.x supports v3 and 3.x supports v4.