chris-greening / instascrape

Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
https://chris-greening.github.io/instascrape/
MIT License
633 stars 109 forks

Lazy iterator for post discovery #127

Open stefco opened 3 years ago

stefco commented 3 years ago

Is your feature request related to a problem? Please describe.

When iterating through a Profile's posts, one does not always know beforehand how many posts are needed. Since posts are fetched page by page over a high-latency network connection, it makes more sense in such cases to iterate through them lazily. This also simplifies the implementation of the post fetcher by pushing control flow up to the caller, where itertools.islice and the like can shape the data properly with no performance penalty versus the existing solution (and with a significant speedup in cases where the required number of results is not known ahead of time). It is particularly useful when performing many differential updates on accounts with large numbers of posts, where a lazy iterator provides a natural way of expressing a performant solution.
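To make the point about caller-side control flow concrete, here is a minimal sketch that does not use instascrape at all. `fetch_page`, `stats`, and the integer post IDs are hypothetical stand-ins for a slow paginated request; the `iter_posts` generator here only illustrates the idea and is not the proposed implementation.

```python
from itertools import islice, takewhile
from typing import Iterator, List

PAGE_SIZE = 12
TOTAL_POSTS = 100
stats = {"requests": 0}  # counts simulated network requests


def fetch_page(page: int) -> List[int]:
    """Hypothetical stand-in for one slow request; returns post IDs, newest first."""
    stats["requests"] += 1
    newest = TOTAL_POSTS - page * PAGE_SIZE
    oldest = max(newest - PAGE_SIZE, 0)
    return list(range(newest, oldest, -1))  # empty once every page is consumed


def iter_posts() -> Iterator[int]:
    """Yield posts one at a time, requesting the next page only when it is needed."""
    page = 0
    while True:
        batch = fetch_page(page)
        if not batch:
            return
        yield from batch
        page += 1


# The caller decides how much to consume; only the first page is fetched.
first_five = list(islice(iter_posts(), 5))

# Differential update: stop at the first post that was already seen.
last_seen = 95
new_posts = list(takewhile(lambda post: post > last_seen, iter_posts()))
```

In this sketch each of the two calls triggers a single simulated request, whereas eagerly collecting all 100 posts would need a request for every page.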

Describe the solution you'd like

I would like to merge my iter_posts method, which implements a lazy iterator over posts. It additionally re-implements the eager get_posts method with far less code as an eager collect on the iterator returned by the new iter_posts method.
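For illustration, the relationship between the two methods could look roughly like the following. `ProfileSketch` is a hypothetical stand-in class rather than instascrape's Profile, and the `amount` parameter is an assumption about how an eager method might limit its results.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional


class ProfileSketch:
    """Hypothetical stand-in for a profile; post IDs replace real post objects."""

    def __init__(self, post_ids: Iterable[int]) -> None:
        self._post_ids = list(post_ids)

    def iter_posts(self) -> Iterator[int]:
        """Lazily yield posts; a real version would fetch pages on demand."""
        for post_id in self._post_ids:
            yield post_id

    def get_posts(self, amount: Optional[int] = None) -> List[int]:
        """Eagerly collect from iter_posts; amount=None collects everything."""
        return list(islice(self.iter_posts(), amount))


profile = ProfileSketch(range(10))
first_three = profile.get_posts(3)
everything = profile.get_posts()
```

The eager method reduces to a single line because all fetching logic lives in the iterator.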

Describe alternatives you've considered

Writing convoluted (and still suboptimal) iterator logic on the caller's side, which would be more confusing and less maintainable than the presented iter_posts method.

Additional context

N/A