jocalafe opened this issue 6 years ago (status: Open)
I completely agree with all 4 points. Should we split them into separate issues and prioritize them? Or would you prefer to work on this as a single big issue? @jocalafe
Move data like URLs and zones into either JSON files or JS modules (I think JSON is better in this case, so that the transition is easier when we move to a db). I would make one of each per provider (e.g. providers/fotocasa/data/zones.json).
👍
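To make the JSON idea concrete, here is a minimal sketch of how a zones file could be parsed and validated. The `Zone` shape (`name`, `path`) and the `parseZones` helper are assumptions for illustration only; the real files may need different fields.

```typescript
// Hypothetical shape of one entry in providers/<provider>/data/zones.json.
// The field names are assumptions, not the project's actual schema.
interface Zone {
  name: string;
  path: string;
}

// Parse the raw contents of a zones file and check each entry,
// so a malformed file fails early instead of mid-scrape.
function parseZones(json: string): Zone[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data)) {
    throw new Error("zones file must contain a JSON array");
  }
  return data.map((entry, index) => {
    if (typeof entry?.name !== "string" || typeof entry?.path !== "string") {
      throw new Error(`invalid zone at index ${index}`);
    }
    return { name: entry.name, path: entry.path };
  });
}

// Example contents (placeholder values).
const exampleZones = parseZones('[{"name": "barcelona", "path": "/barcelona"}]');
```

Keeping the validation in one function should also make it easier to swap the JSON file for a db query later, since callers only depend on `Zone[]`.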
Split fotocasa and idealista into 2 separate modules. I think we can abstract all the common code into a generic provider (a third module) that the specific providers will use to scrape with their own selectors (which could also be stored in data).
I think generics will come in handy here to create a well-designed architecture.
We could also extract all the functions that print stuff to the console into a separate module. This will help us split the cli from the api when this is finally done.
The idea behind this isn't very clear to me, but fine 👌
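A rough sketch of what the generic provider could look like, assuming TypeScript. Every name here (`Provider`, `QueryFn`, `scrape`, `FlatListing`) and the selectors are hypothetical, and `QueryFn` stands in for whatever the scraping framework exposes for reading an element's text.

```typescript
// Stand-in for the scraping framework: given a CSS selector,
// return the matched element's text, or null if nothing matches.
type QueryFn = (selector: string) => string | null;

// What each specific provider (fotocasa, idealista) would supply.
interface Provider<TListing> {
  name: string;
  selectors: Record<string, string>;
  parse(fields: Record<string, string | null>): TListing;
}

// The shared code: run every selector, then let the provider
// turn the raw strings into its own listing type.
function scrape<TListing>(provider: Provider<TListing>, query: QueryFn): TListing {
  const fields: Record<string, string | null> = {};
  for (const [field, selector] of Object.entries(provider.selectors)) {
    fields[field] = query(selector);
  }
  return provider.parse(fields);
}

interface FlatListing {
  title: string;
  price: number | null;
}

// Placeholder selectors, not the real ones.
const fotocasa: Provider<FlatListing> = {
  name: "fotocasa",
  selectors: { title: ".listing-title", price: ".listing-price" },
  parse: (fields) => {
    const rawPrice = fields.price;
    return {
      title: fields.title ?? "",
      price: rawPrice ? Number(rawPrice.replace(/[^\d]/g, "")) : null,
    };
  },
};
```

With this split, adding idealista would mean writing another `Provider` object, while `scrape` stays untouched.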
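To illustrate the idea behind the printing module: if the scraping code only returns data and all output goes through one small module, the cli can print while the api just returns the same data. A minimal sketch, where `PrintableListing`, `formatListing`, and `printListings` are hypothetical names:

```typescript
interface PrintableListing {
  title: string;
  price: number;
}

// Pure formatting, with no console access, so it is easy to test.
function formatListing(listing: PrintableListing): string {
  return `${listing.title}: ${listing.price} EUR`;
}

type Logger = (line: string) => void;

// The only place that writes output. The cli uses the default logger;
// tests or the api can pass their own, or skip printing entirely.
function printListings(listings: PrintableListing[], log: Logger = console.log): void {
  for (const listing of listings) {
    log(formatListing(listing));
  }
}
```

For example, a test could call `printListings(listings, (line) => lines.push(line))` and check the collected lines without touching the console.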
Adding tests is critical at this stage to make the migration easier and more reliable.
This has a dependency with #23; we should do this first.
Now that the scraper is "working", I think we should focus on making it as nice and clean as possible before migrating it to a different scraping framework.
Things we can improve: