Enhancement: separate pandas utils from generals Utils.py

ykud commented 4 years ago

Hi guys,

Not an issue, just a suggestion :) It would be really helpful to have the pandas-related stuff separate from the base Utils, as it greatly increases the executable size when packaging a TM1py-based script with py2exe or PyInstaller. Packaging pandas takes ~10 MB, and it cannot be excluded even if these methods are not used, as all the Service classes import Utils.* and therefore import pandas as well.

Thank you, Yuri

MariusWirtz commented 4 years ago

Hi @ykud,

that's a fair point. I suppose from a design perspective it would be cleaner that way. In the past, some users reported difficulties when installing TM1py in an environment without access to the internet. I suppose with a core TM1py package we would make their lives easier as well.

Yes, perhaps we could extract all methods with a pandas dependency into a side project. Something like tm1py-pandas.

If we go down that road, we could build a dedicated Service that provides functionality for pandas data frames. So we could move some functions from the CellService, the Utils module and the new PowerBiService into this new project.

We would have to come up with a smart solution in order to not break backwards compatibility, for users that are currently working with the existing functions. Perhaps importing the new functions locally and executing them inside the old functions would do the trick (and throw a warning of course).
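A minimal sketch of that idea, under stated assumptions: the module name `tm1py_pandas_demo` and both function names are hypothetical and not part of TM1py. A stand-in module is registered in `sys.modules` only so that the example runs on its own; in practice it would be the separately installed side project.

```python
import sys
import types
import warnings

# Stand-in for the hypothetical side project, so this sketch is self-contained.
_new_pkg = types.ModuleType("tm1py_pandas_demo")


def _rows_to_records(rows, columns):
    """Hypothetical 'new' function that would live in the side project."""
    return [dict(zip(columns, row)) for row in rows]


_new_pkg.rows_to_records = _rows_to_records
sys.modules["tm1py_pandas_demo"] = _new_pkg


def legacy_rows_to_records(rows, columns):
    """Hypothetical 'old' function kept only for backwards compatibility."""
    warnings.warn(
        "legacy_rows_to_records is deprecated; "
        "use tm1py_pandas_demo.rows_to_records instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Local import: the new package is only loaded when the old function is called,
    # so importing the core module does not pull in the heavy dependency.
    from tm1py_pandas_demo import rows_to_records

    return rows_to_records(rows, columns)
```

Existing callers keep working and see a `DeprecationWarning`, while users who never call the old function never trigger the import.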

This would require a bit of refactoring. Perhaps this is a change we could aim for in the next major release (2.0).

What do you think? Would you be interested to contribute?

Any third opinions on this topic?

Cheers,

Marius

lieslberry commented 4 years ago

Hi Marius

I am not an expert and am only now trying to learn Python, but isn't there a way to check if pandas is installed and then import it if it exists?

Liesl
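For reference, a check of the kind asked about could look roughly like this. It is only a sketch using the standard library's `importlib.util.find_spec`; the helper name `is_installed` and the `HAS_PANDAS` flag are made up for illustration and are not TM1py names.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Only import pandas when it is actually available.
HAS_PANDAS = is_installed("pandas")
if HAS_PANDAS:
    import pandas as pd
else:
    pd = None  # pandas-dependent code has to check this before using pd
```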

MariusWirtz commented 4 years ago

Hi @lieslberry,

At the moment we have a fixed list of dependencies for TM1py, and pandas is one of them. So now when you install TM1py with pip or conda, it will always install pandas with all its dependencies (numpy, python-dateutil, etc.). So in a normal setup pandas must be installed.

The controversial point is that 95% of the TM1py module is not actually using pandas. So many TM1py users don't actually need it, but it may cause them problems, like when they want to package a Python script into an executable.

If we don't declare the pandas dependency, but import pandas with a try/except approach, the users who actually need it will have to go and install it on their own. That's not really nice either.
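One way such an undeclared, guarded import might look is sketched below. The names `require_pandas` and `records_to_dataframe` are hypothetical, not TM1py functions; the point is only that the failure can at least tell the user what to install.

```python
def require_pandas():
    """Import pandas on demand, or fail with an actionable message."""
    try:
        import pandas
    except ImportError as exc:
        raise ImportError(
            "This function needs pandas, which is not installed. "
            "Install it with: pip install pandas"
        ) from exc
    return pandas


def records_to_dataframe(records):
    """Hypothetical pandas-backed helper (name is an assumption)."""
    pd = require_pandas()
    return pd.DataFrame(records)
```

Code that never calls a pandas-backed helper never touches pandas, but users who do need it still have to install it themselves, which is the drawback described above.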

ykud commented 4 years ago

Hi @MariusWirtz ,

I agree that a 'leaner' tm1py would be beneficial in the long run and all the additions like pandas and powerbi connectivity can be split into separate projects.

I think it should be a 'breaking' change of sorts: it might not be that bad to sever it cleanly (without re-importing the pandas functions), as most people aren't using pandas anyway and wouldn't notice, and you'd want everyone who does use pandas to switch to "tm1py-pandas".

I would love to help, but I think this probably isn't the highest priority just yet; I'd rather spend some time finalising the work @rclapp started on Git services first.

I can live with the largish .exe files generated at the moment; I just thought it's an idea worth discussing.

DJHig commented 4 years ago

Hi @MariusWirtz, I would also have liked pandas not to be a requirement for projects that do not need those features. My reason? It is rather difficult to get a Docker build working from a base Alpine image when pandas is involved, since those binaries are not always available in the Alpine repository.

How about moving the pandas dependency to an optional "extras" (https://setuptools.readthedocs.io/en/latest/setuptools.html#declaring-extras-optional-features-with-their-own-dependencies)? That way, projects that need the pandas features can install TM1py with: pip install TM1py[pandas] or [dataframes] or whatever you want to call it.
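A rough sketch of how such an extra could be declared with setuptools follows. The package name, version, and the `requests` entry are placeholders rather than TM1py's actual setup.py; only `install_requires` and `extras_require` are real setuptools keywords. The `setup()` call is left as a comment so the snippet can run outside a build.

```python
def build_setup_kwargs():
    """Keyword arguments for setuptools.setup(), with pandas as an optional extra."""
    return {
        "name": "example-package",  # placeholder, not the real project name
        "version": "0.1.0",  # placeholder
        # Core dependencies: always installed.
        "install_requires": ["requests"],  # placeholder list
        # Optional dependencies: installed only when the extra is requested.
        "extras_require": {"pandas": ["pandas"]},
    }


# In an actual setup.py these would be passed on to setuptools:
# from setuptools import setup
# setup(**build_setup_kwargs())
```

With a declaration like this, `pip install example-package` skips pandas, while `pip install "example-package[pandas]"` pulls it in.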

MariusWirtz commented 4 years ago

Hi,

@ykud thanks for the feedback! It makes sense.

@DJHig I was not aware of those optional dependencies. I love it. I think we should go down this road.

cubewise-code / tm1py

Enhancement: separate pandas utils from generals Utils.py #214