theventurecity / data-toolkit

Data Pipeline Toolkit for Early-Stage Startups
MIT License

Cohort Analysis pipeline error #1

Open oriolvall opened 3 years ago

oriolvall commented 3 years ago

Hello,

Before I describe the issue I'm encountering, I want to thank you for creating and sharing this GitHub repository; it has been very helpful.

The error occurs when I call the 'create_xau_cohort_df' function while running the Mini-Pipeline: Cohort Analysis. The error printed is the following:

'<' not supported between instances of 'pandas._libs.tslibs.offsets.MonthEnd' and 'pandas._libs.tslibs.offsets.MonthEnd'

Do you happen to know the solution to this problem?

Thank you in advance,

Oriol

dksmith01 commented 3 years ago

Oriol - I'm glad that you have found our Data Pipeline Toolkit and are trying to use it, and I'm sorry it's not working for you. The issue you are seeing is caused by the version of Pandas installed on your machine. Unfortunately, this toolkit is two years old; Pandas has evolved since then, and I haven't had a chance to keep pace.

If you are able to create a Conda or other type of environment that uses Pandas version 0.23.4, it should work for you.

You could also try to create a copy of this Colab notebook https://colab.research.google.com/drive/11xU3q7kTRs7hBbd5uSmiZeChKwaev88A and feed your data into it to test. Note that, at the top of the first Python code block in that notebook, it runs !pip install pandas==0.23.4. Please let me know what you try and if you're able to get it working. If necessary, we can schedule a Zoom to discuss it.
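If pinning an old Pandas is not practical, another option may be to patch the comparison itself. The sketch below rests on an assumption that is not confirmed in this thread: that the error comes from subtracting one monthly Period from another, which returned a plain integer in older Pandas but returns an offset object (such as a multiple of MonthEnd) in newer releases, and those objects cannot be sorted with '<'. The helper name offset_to_int is hypothetical and is not part of the toolkit.

```python
def offset_to_int(diff):
    """Return a plain integer count from a Period difference.

    Assumption: older Pandas gives an int for Period - Period, while
    newer releases give an offset object whose multiple is stored in `.n`.
    Values without an `.n` attribute are passed through int() unchanged.
    """
    return int(getattr(diff, "n", diff))


if __name__ == "__main__":
    try:
        import pandas as pd  # optional: only needed for this demonstration
    except ImportError:
        pd = None

    if pd is not None:
        first_month = pd.Period("2021-03", freq="M")
        later_month = pd.Period("2021-06", freq="M")
        # The raw difference may be an int or an offset, depending on version.
        print(offset_to_int(later_month - first_month))
```

Applying a helper like this to the column of month differences before sorting or pivoting should leave ordinary integers, which compare without error in any Pandas version. Whether 'create_xau_cohort_df' computes its cohort index this way would need to be checked against the toolkit source.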

I'm curious: are you doing this for a startup? If so, we might be interested in talking to you about working with TheVentureCity. Check us out here https://theventure.city/.

Best, David

oriolvall commented 3 years ago

Hello David, after switching to an older version of pandas, a different issue appears: some other functions, such as map, can no longer be imported, and I get the following error:

cannot import name 'map' from 'pandas.compat'

If by any chance you know the solution, it would help incredibly.

Thank you very much,

Oriol
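One possible reading of the import error above, offered as an assumption and not confirmed in this thread, is that the running Python session is loading a different Pandas from the one that was just installed, for example because the kernel was not restarted after the downgrade or because two environments are mixed. A quick way to check is to print the version and file location of the module that actually gets imported. The helper name describe_module is hypothetical.

```python
import importlib


def describe_module(name):
    """Return the version and file path of a module, or None if it is absent.

    Only ModuleNotFoundError is caught, so an import that fails partway
    (such as "cannot import name ...") still surfaces with its traceback.
    """
    try:
        module = importlib.import_module(name)
    except ModuleNotFoundError:
        return None
    return {
        "version": getattr(module, "__version__", "unknown"),
        "file": getattr(module, "__file__", "unknown"),
    }


if __name__ == "__main__":
    # Compare this with the version you believe is installed.
    print(describe_module("pandas"))
```

If the reported version is not 0.23.4, or the path points to an unexpected environment, restarting the kernel or recreating the environment from scratch may be worth trying before debugging further.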

dksmith01 commented 7 months ago

@oriolvall I know it's been a long time since your initial request, but the Data Pipeline Toolkit has been updated to use the latest versions of Pandas and other libraries, so it works again.