Closed sckott closed 4 years ago
thoughts @ethanwhite ?
I think just returning dicts is good. Easy enough to convert to dataframes if the format matches what Pandas takes.
thanks! will do then
Awe. But I like Pandas!
doesn't it seem like a heavy dependency though?
It is, darn it. Is the use-case of pytaxize without Pandas particularly appealing? As long as it still plays nice with Pandas by default, I suppose it doesn't need to be required per se.
It also depends on how closely you want it to reflect the R package. If you want it be pretty close, DataFrames are the logical way to make Python behave in an R-ish kind of way.
Indeed, the R client is focused on data.frame outputs, so in that way using pandas makes it more similar to that. How common will it be for the same person to be switching between taxize and pytaxize? Or is the most common case any one person uses one or the other?
Do you think returning dicts is the most idiomatic python? That is, is that what someone would expect to get back by default?
Dropping pandas makes this library simpler and easier to maintain, so that's a + for dropping
Thanks for creating this package, I just came across it today.
I definitely see the case for making the library simpler and easier to maintain but my personal preference would be to maintain the option of getting a pandas dataframe returned if at all possible. I think it is becoming idiomatic, especially in the python data science community, to use pandas dataframes as inputs and returns for functions.
If I can be of assistance in maintaining or testing, please let me know.
@ColinTalbert thanks for your thoughts. Two votes to keep pandas, so i'm willing to keep it. Maybe it makes sense to allow user to say whether they want a pandas dataframe back or a dict?
If I can be of assistance in maintaining or testing, please let me know.
Thanks for the issue #43 - that kind of help is great! using and reporting bugs. Also, any feature requests is great. There are still many things to port over from taxize
- so this client is a ways off from being complete at least wrt its R cousin
One option I've seen and liked is including an optional parameter to specify the return format: https://github.com/ulmo-dev/ulmo/blob/0c6337993542f7a46ed9d528ea9046628cd9f8bc/ulmo/nasa/daymet/core.py#L66
If you used this I think you could also set up the pandas import as optional, which might might make installation easier for folks who aren't using pandas (ArcMap users, perhaps)
Colin Talbert GIS Analyst and Developer US Geological Survey
talbertc@usgs.gov
USGS Fort Collins Science Center 2150 Centre Ave. Bldg. C Fort Collins, CO 80526 (970) 226-9425
USGS North Central Climate Science Center (970) 492-4283
Work schedule: Monday - 7:00 - 3:00 (FORT) Tuesday - 7:00 - 3:00 (NC CSC) Wednesday - 7:00 - 3:00 (FORT) Thursday - 7:00 - 5:00 (FORT) Friday - 7:00 - 5:00 (FORT)
On Wed, Oct 26, 2016 at 10:41 AM, Scott Chamberlain < notifications@github.com> wrote:
@ColinTalbert https://github.com/ColinTalbert thanks for your thoughts. Two votes to keep pandas, so i'm willing to keep it. Maybe it makes sense to allow user to say whether they want a pandas dataframe back or a dict?
If I can be of assistance in maintaining or testing, please let me know.
Thanks for the issue #43 https://github.com/sckott/pytaxize/issues/43 - that kind of help is great! using and reporting bugs. Also, any feature requests is great. There are still many things to port over from taxize - so this client is a ways off from being complete at least wrt its R cousin
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/sckott/pytaxize/issues/35#issuecomment-256407329, or mute the thread https://github.com/notifications/unsubscribe-auth/ACJURIu18lCaSa_nKQnK8EYkLal8mAbvks5q34LIgaJpZM4Hbdzo .
As a parameter sounds good to me! ✔️
If you used this I think you could also set up the pandas import as optional
how do you do optional imports in a python library?
something along the lines of:
try: import pandas as pd
except ImportError: pd = None
Colin Talbert GIS Analyst and Developer US Geological Survey
talbertc@usgs.gov
USGS Fort Collins Science Center 2150 Centre Ave. Bldg. C Fort Collins, CO 80526 (970) 226-9425
USGS North Central Climate Science Center (970) 492-4283
Work schedule: Monday - 7:00 - 3:00 (FORT) Tuesday - 7:00 - 3:00 (NC CSC) Wednesday - 7:00 - 3:00 (FORT) Thursday - 7:00 - 5:00 (FORT) Friday - 7:00 - 5:00 (FORT)
On Thu, Oct 27, 2016 at 9:56 AM, Scott Chamberlain <notifications@github.com
wrote:
If you used this I think you could also set up the pandas import as optional
how do you do optional imports in a python library?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/sckott/pytaxize/issues/35#issuecomment-256687584, or mute the thread https://github.com/notifications/unsubscribe-auth/ACJURDzHLtUqSCCo2P6JxWvSf4L4GAfJks5q4Mm_gaJpZM4Hbdzo .
and then in in code:
if not pd: raise Exception("Pandas need for this functionality")
Colin Talbert GIS Analyst and Developer US Geological Survey
talbertc@usgs.gov
USGS Fort Collins Science Center 2150 Centre Ave. Bldg. C Fort Collins, CO 80526 (970) 226-9425
USGS North Central Climate Science Center (970) 492-4283
Work schedule: Monday - 7:00 - 3:00 (FORT) Tuesday - 7:00 - 3:00 (NC CSC) Wednesday - 7:00 - 3:00 (FORT) Thursday - 7:00 - 5:00 (FORT) Friday - 7:00 - 5:00 (FORT)
On Thu, Oct 27, 2016 at 9:58 AM, Talbert, Colin talbertc@usgs.gov wrote:
something along the lines of:
try: import pandas as pd
except ImportError: pd = None
Colin Talbert GIS Analyst and Developer US Geological Survey
talbertc@usgs.gov
USGS Fort Collins Science Center 2150 Centre Ave. Bldg. C Fort Collins, CO 80526 (970) 226-9425
USGS North Central Climate Science Center (970) 492-4283
Work schedule: Monday - 7:00 - 3:00 (FORT) Tuesday - 7:00 - 3:00 (NC CSC) Wednesday - 7:00 - 3:00 (FORT) Thursday - 7:00 - 5:00 (FORT) Friday - 7:00 - 5:00 (FORT)
On Thu, Oct 27, 2016 at 9:56 AM, Scott Chamberlain < notifications@github.com> wrote:
If you used this I think you could also set up the pandas import as optional
how do you do optional imports in a python library?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/sckott/pytaxize/issues/35#issuecomment-256687584, or mute the thread https://github.com/notifications/unsubscribe-auth/ACJURDzHLtUqSCCo2P6JxWvSf4L4GAfJks5q4Mm_gaJpZM4Hbdzo .
ah, okay
I've taken a stab at implementing this, take a look at: https://github.com/ColinTalbert/pytaxize/blob/as_dataframe/pytaxize/itis.py#L9-L13 and: https://github.com/ColinTalbert/pytaxize/blob/as_dataframe/pytaxize/itis.py#L71-L123 for how I was thinking this would look.
There are obviously many functions in the itis module that would need this change, let alone the modules for other services. If this seems like a workable design pattern, I'll see what I can do to add this functionality to the rest of the itis functions, before submitting a pr.
thanks @ColinTalbert - looks good, though can we shorten the param as_dataframe
to dataframe
, just for brevity sake
i realized that there's at least some, didn't look thoroughly through package, internal munging of data with pandas
- which I think should be changed to just munging native python data structures - so we'll need to do that too
don't we also need extra_requires
or similar in setup.py
?
Yes, good point. I'll throw that in as well.
Colin Talbert GIS Analyst and Developer US Geological Survey
talbertc@usgs.gov
USGS Fort Collins Science Center 2150 Centre Ave. Bldg. C Fort Collins, CO 80526 (970) 226-9425
USGS North Central Climate Science Center (970) 492-4283
Work schedule: Monday - 7:00 - 3:00 (FORT) Tuesday - 7:00 - 3:00 (NC CSC) Wednesday - 7:00 - 3:00 (FORT) Thursday - 7:00 - 5:00 (FORT) Friday - 7:00 - 5:00 (FORT)
On Fri, Oct 28, 2016 at 12:38 PM, Scott Chamberlain < notifications@github.com> wrote:
don't we also need extra_requires or similar in setup.py ?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/sckott/pytaxize/issues/35#issuecomment-256996266, or mute the thread https://github.com/notifications/unsubscribe-auth/ACJURLFyDCqG4sJlFgjMfMRuTTOR5eyCks5q4kEPgaJpZM4Hbdzo .
Why is this thread still open? :( Why can't we sort this out? has there been a pull request sent?
@dheerajpai if @ColinTalbert send a PR we can discuss that. if we don't hear from him soon I can go ahead and do it
very heavy dependency, and probably more idiomatic python to just return dicts