javascriptdata / danfojs

Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.
https://danfo.jsdata.org/
MIT License
4.77k stars, 209 forks

[Feature Request] minimal installation - optionally drop large libraries such as plotly #389

Open yeus opened 2 years ago

yeus commented 2 years ago

Is your feature request related to a problem? Please describe.
danfojs is too large to include in PWA/SPA applications. It would help if we could at least exclude some of the large dependencies, or if danfojs were modular so that we don't have to import the entire library at once, similar to lodash.

Describe the solution you'd like
Somehow decrease the size in order to make danfojs more usable in client-side applications ;). The best case would be a modular library similar to lodash's solution.

Describe alternatives you've considered
Doing the calculations server-side. But there are some huge advantages to doing calculations client-side, among them configurability and especially (as in my use case) privacy regulations, which can be circumvented by doing the calculations client-side.

Additional context
I have uploaded a small analysis of my SPA. danfojs pulls in tensorflow, plotly, mathjs, xlsx, and several other packages. It would be great

image

risenW commented 2 years ago

@yeus This is a good suggestion, and something I have also considered. We'll do some more analysis on this.

yeus commented 2 years ago

I have isolated the danfojs dependencies here:

image

I am able to use webpack to lazily load danfojs dependencies in the background using my own configuration for the "splitChunks" plugin (https://webpack.js.org/plugins/split-chunks-plugin/). But it would still be great if I didn't have to load plotly etc... ;).

You can see that just by dropping plotly and xlsx you can already get rid of more than 50% of the package size...

I assume @tensorflow/mathjs isn't possible to drop, although that would be great as well. I don't see mathjs as really necessary for dataframes...

risenW commented 2 years ago

Thanks for sharing this. We'll move Plotly and XLSX to peer dependencies.
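As an illustration of what moving a package to a peer dependency usually implies for library code, here is a minimal sketch of loading an optional package on demand. It is not danfojs's implementation. `loadOptional` is a hypothetical helper, and the package name in the usage comment is an assumption.

```javascript
// Sketch: resolve an optional peer dependency only when a feature needs it.
// `loader` defaults to Node's require; it is a parameter so the helper can be
// exercised without the package being installed.
function loadOptional(name, loader = require) {
  try {
    return loader(name);
  } catch (err) {
    if (err && err.code === "MODULE_NOT_FOUND") {
      // Turn the generic resolution failure into an actionable message.
      throw new Error(
        `Optional peer dependency "${name}" is not installed. ` +
          `Install it to use this feature.`
      );
    }
    throw err; // unrelated errors are passed through unchanged
  }
}

// Hypothetical usage inside a plotting method (package name assumed):
//   const Plotly = loadOptional("plotly.js-dist-min");
```

The point of this design is that users who only need the DataFrame API never install or bundle the plotting package, while users who call a plotting method without it get a clear error.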

dvdjhys commented 1 year ago

Hello folks,

Before commenting I just want to say that I love this package and thank you for the effort that has gone into it.

We have included it in some new development and make use of the DataFrame API to process some data, but we don't require the plotting capability. Unfortunately our build times have taken quite a hit, and we believe the size of plotly, etc. is contributing to a webpack slowdown.

So I was just wondering if anything had happened to move plotly etc to peer dependencies? I think the answer is no but I thought I'd check to see if I had gotten the wrong end of the stick.

Cheers & Thanks

bml1g12 commented 5 months ago

Is it possible to also move tensorflow to a peer dependency, as I think it's not needed for basic dataframe operations? EDIT: I gather it's being used even for basic operations, so I guess that's out of the question.

HassanAtWecrunch commented 4 months ago

Hi folks, any new info regarding dropping plotly? Thanks.