viralogic / py-enumerable

A Python module used for interacting with collections of objects using LINQ syntax
MIT License

Speed #74

Open davepruitt opened 2 years ago

davepruitt commented 2 years ago

First, let me say that I really love this package that you've developed. I think LINQ-style syntax is a lot more intuitive and easier to read and understand than Python list comprehensions.

As an example, doing this:

filtered_times = Enumerable(all_times).where(lambda x: ((x >= start_time) and (x <= end_time))).to_list()

is more readable and intuitive than this:

filtered_times = [x for x in all_times if ((x >= start_time) and (x <= end_time))]

My one issue is with speed. I've noticed that a list comprehension is orders of magnitude faster than the equivalent py-enumerable call.

Is there any way to speed up the performance of py-enumerable?

viralogic commented 2 years ago

Hey! Thanks for the comment. I really like the LINQ syntax as well.

In terms of making the performance comparable to list comprehensions, I think some major re-architecting would be needed. This is something I have thought about as well. It definitely belongs on the project roadmap as something to investigate and implement, but I don't have an immediate solution for you.

viralogic commented 1 year ago

Just going to create some benchmarks around this issue and compare with list comprehensions so that there are metrics around what sort of performance gains need to be achieved.
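
A rough sketch of what such a benchmark could look like, using the standard-library `timeit` module and the filter from the original post. The function name `run_benchmark`, the input sizes, and the `filter` baseline are assumptions made for illustration; the py-enumerable case is only timed if `py_linq` happens to be installed.

```python
import timeit


def run_benchmark(size=10_000, repeats=20):
    """Return the best average seconds per call for each filtering approach."""
    # Hypothetical data standing in for all_times / start_time / end_time.
    all_times = list(range(size))
    start_time, end_time = size // 4, size // 2

    def in_range(x):
        return start_time <= x <= end_time

    candidates = {
        "list comprehension": lambda: [x for x in all_times if in_range(x)],
        "filter + list": lambda: list(filter(in_range, all_times)),
    }

    # Only include py-enumerable when it is available in this environment.
    try:
        from py_linq import Enumerable
    except ImportError:
        pass
    else:
        candidates["py-enumerable"] = (
            lambda: Enumerable(all_times).where(in_range).to_list()
        )

    results = {}
    for name, fn in candidates.items():
        # Take the fastest of three runs to reduce noise from other processes.
        best = min(timeit.repeat(fn, number=repeats, repeat=3))
        results[name] = best / repeats
    return results


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print(f"{name:20s} {seconds * 1e6:10.1f} microseconds per call")
```

Using the same predicate function in every case keeps the comparison about the iteration machinery rather than the cost of the lambda itself.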

shayneoneill commented 1 year ago

If you want to speed this up, you'll probably need to rearchitect it to use lazier evaluation.

There are some good hints here: https://stackoverflow.com/questions/39154269/python-linq-capable-class-supporting-simultaneously-laziness-and-fluent-design

To be clear, it's probably necessary for any larger dataset or when iterating over IO.
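
For illustration, here is a minimal sketch of what lazier evaluation with a fluent interface could look like, built on plain generators. `LazyQuery` and its methods are hypothetical names chosen for this example; this is not py-enumerable's actual implementation or API.

```python
import itertools


class LazyQuery:
    """Hypothetical fluent wrapper that defers all work until iteration."""

    def __init__(self, iterator_factory):
        # A zero-argument callable that returns a fresh iterator each time.
        self._factory = iterator_factory

    @classmethod
    def of(cls, iterable):
        return cls(lambda: iter(iterable))

    def __iter__(self):
        return self._factory()

    def where(self, predicate):
        # Nothing is filtered here; the generator is built on iteration.
        return LazyQuery(lambda: (x for x in self if predicate(x)))

    def select(self, func):
        return LazyQuery(lambda: (func(x) for x in self))

    def take(self, count):
        # islice stops pulling from upstream once `count` items are produced.
        return LazyQuery(lambda: itertools.islice(self, count))

    def to_list(self):
        # The only method that actually walks the pipeline.
        return list(self)


if __name__ == "__main__":
    query = (
        LazyQuery.of(range(1_000_000))
        .where(lambda x: x % 2 == 0)
        .select(lambda x: x * x)
        .take(3)
    )
    print(query.to_list())  # [0, 4, 16]
```

Because each stage only wraps the previous one, the million-element range above is never fully traversed: `take(3)` stops after the third match. The same shape would let a query run over a file or socket without loading it into memory first.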