juliomalegria / python-craigslist

Simple Craigslist wrapper
MIT No Attribution
387 stars 117 forks

Count items returned from Craigslist locations shows tendencies for modulo 120 #68

Closed irahorecka closed 3 years ago

irahorecka commented 4 years ago

Hello Julio,

I noticed that python-craigslist returns items from Craigslist locations with posting counts that tend to be multiples of 120. To clarify, the maximum number of items returned from a Craigslist location for a category (e.g. apts/housing from santabarbara) is 3000. The 120 is the number of listings per Craigslist page, so a category in a given location should span at most 25 pages (3000 / 120).
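The arithmetic in this comment can be sketched as follows. This is a minimal illustration that assumes the page size of 120 and the cap of 3000 stated above; the names PAGE_SIZE, MAX_RESULTS, and describe_count are hypothetical and not part of python-craigslist.

```python
import math

PAGE_SIZE = 120      # listings per Craigslist page, as stated in the report
MAX_RESULTS = 3000   # per-location, per-category cap, as stated in the report


def describe_count(count):
    """Return (pages_needed, is_multiple_of_page_size) for a result count."""
    pages = math.ceil(count / PAGE_SIZE)
    return pages, count % PAGE_SIZE == 0


print(MAX_RESULTS // PAGE_SIZE)   # 25 full pages at the cap
print(describe_count(240))        # (2, True): ends exactly on a page boundary
print(describe_count(250))        # (3, False): ends partway through page 3
```

A count that ends exactly on a page boundary is not proof of a problem on its own, since a location can legitimately have 240 postings. The pattern only becomes suspicious when such counts are overrepresented across many locations.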

My speculation is that the get_results() method of python-craigslist stalls and exits when moving from one listing page to the next. I noticed this especially when geotagged=True. I am currently looking at apts/housing in the United States through CraigslistHousing. Please see the figure below, which illustrates this observation (i.e. the high peaks):

[Figure: python-craigslist_post_freq]
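One way to check a single location for this behavior is to count what get_results() yields and test whether the count stops on a page boundary below the cap. The sketch below is an assumption-laden illustration, not the reporter's actual script: count_results and stopped_on_page_boundary are hypothetical helpers, and the commented usage assumes that the category code "apa" corresponds to apts/housing.

```python
def count_results(client, geotagged=False):
    """Count the items yielded by client.get_results().

    `client` is any object exposing get_results(geotagged=...), such as a
    CraigslistHousing instance.
    """
    return sum(1 for _ in client.get_results(geotagged=geotagged))


def stopped_on_page_boundary(count, page_size=120, cap=3000):
    """True if count is a nonzero multiple of page_size below the cap."""
    return 0 < count < cap and count % page_size == 0


# Example usage against the real library (requires network access):
# from craigslist import CraigslistHousing
# housing = CraigslistHousing(site="santabarbara", category="apa")
# n = count_results(housing, geotagged=True)
# print(n, stopped_on_page_boundary(n))
```

Running the same count with geotagged=False and geotagged=True for the same location would show whether the two runs disagree, which is the comparison this report hinges on.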

juliomalegria commented 4 years ago

To make sure I fully understand the graph: the X axis is continuous, right? So it includes values like, say, 0.25 or 2.7, and you're pointing out the peaks at integer values, which indicate that the result counts are mostly multiples of 120. Did I understand that right?
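A rough sketch of how such a graph could be built: divide each location's result count by 120 and bin the quotients, so that integer positions on the X axis correspond to exact multiples of 120. The function name pages_histogram, the bin width, and the sample counts are all hypothetical, chosen only to illustrate the idea; they are not data from this issue.

```python
from collections import Counter

PAGE_SIZE = 120  # listings per page, per the original report


def pages_histogram(counts, page_size=PAGE_SIZE, bins_per_page=4):
    """Bin each count by its position in units of pages.

    Keys are the lower edge of each bin, expressed in pages, so a key of
    2.0 holds counts from 240 up to (but not including) 270 when
    bins_per_page is 4.
    """
    hist = Counter()
    for count in counts:
        # Integer arithmetic avoids floating-point rounding at bin edges.
        bin_index = count * bins_per_page // page_size
        hist[bin_index / bins_per_page] += 1
    return dict(sorted(hist.items()))


# Made-up counts for illustration only.
sample_counts = [120, 240, 240, 30, 324, 360]
print(pages_histogram(sample_counts))
```

With bins this coarse, a count slightly above a multiple of 120 shares a bin with the exact multiple, so a finer bin width (or a direct count % 120 == 0 check) is the better choice when the goal is to isolate exact multiples.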

irahorecka commented 4 years ago

Yes, that is correct. To add some clarification, this occurs most frequently when geotagged is set to True.

irahorecka commented 3 years ago

Hey, Julio - I'll close this issue for now, since I have not experienced this lately. I'll let you know if this is an active issue in the future.