neokd / DataStorehouse

DataStoreHouse is an open-source project that aims to create a collaborative platform for gathering and sharing a wide variety of datasets. It provides a centralised repository where individuals and organisations can contribute, discover, and collaborate on diverse datasets for various domains.
https://datash.vercel.app
MIT License

Bug fix and list replaced with deque #115

Open Bchass opened 1 year ago

Bchass commented 1 year ago

Description

Fixed a path bug in load_proxy_list() and switched rotate_proxy() from a list to a deque.

Related Issues

https://github.com/neokd/DataStorehouse/issues/112

Changes Made

load_proxy_list(): no longer relies on an absolute path, which wasn't working as intended.
rotate_proxy(): switched over to a deque for better efficiency when rotating proxies.
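For readers without the diff open, here is a rough sketch of the idea rather than the actual code from this PR. The `filename` and `base_dir` parameters, and the one-proxy-per-line file format, are assumptions made for illustration; only the two function names come from the description above.

```python
from collections import deque
from pathlib import Path


def load_proxy_list(filename, base_dir=None):
    """Load proxies (assumed: one per line) into a deque.

    The file is resolved against base_dir, or the current working
    directory when base_dir is omitted, so no absolute path is hard-coded.
    Both parameters are hypothetical, not the project's real signature.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    with open(base / filename, encoding="utf-8") as handle:
        # Skip blank lines and strip trailing newlines.
        return deque(line.strip() for line in handle if line.strip())


def rotate_proxy(proxies):
    """Return the proxy at the front and move it to the back of the deque."""
    if not proxies:
        raise ValueError("proxy list is empty")
    proxy = proxies[0]
    # rotate(-1) shifts every item one step left, so the front item
    # wraps around to the end without shifting a whole list.
    proxies.rotate(-1)
    return proxy
```

With a plain list, the same rotation is usually written as `proxies.append(proxies.pop(0))`, where `pop(0)` has to shift every remaining element. A deque avoids that cost, which is presumably the efficiency gain meant here.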

Screenshots (if applicable)

N/A

Checklist

Please review and check the following before submitting your pull request:

Additional Notes

I would like to add test cases for this

vercel[bot] commented 1 year ago

Someone is attempting to deploy a commit to a Personal Account owned by @neokd on Vercel.

@neokd first needs to authorize it.

neokd commented 1 year ago

Can the proxy be made a parameter? For example, if the user sets it to True, the proxy is used; otherwise it tries to scrape without a proxy.

Bchass commented 1 year ago

> Can the proxy be made a parameter? For example, if the user sets it to True, the proxy is used; otherwise it tries to scrape without a proxy.

Are you referring to a user-agent a browser would use?

neokd commented 1 year ago

No. For now, we were thinking that the proxy is getting complicated, so could we let the user pass an argument saying whether the scraper should use the proxy or not?

Bchass commented 1 year ago

> No. For now, we were thinking that the proxy is getting complicated, so could we let the user pass an argument saying whether the scraper should use the proxy or not?

I see. I'll look into adding that option for the scraper.
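Something along these lines might work as a starting point. This is only a sketch: `build_request_kwargs`, the `use_proxy` flag and the `timeout` default are hypothetical names and values, not anything already in the scraper. The returned dict uses the `proxies` mapping shape that the `requests` library accepts.

```python
from collections import deque


def build_request_kwargs(use_proxy=False, proxies=None):
    """Build keyword arguments for an HTTP call, with or without a proxy.

    Hypothetical helper: use_proxy is the opt-in flag discussed above,
    and proxies is a deque of proxy URLs.
    """
    kwargs = {"timeout": 10}  # assumed default, purely illustrative
    if use_proxy:
        if not proxies:
            raise ValueError("use_proxy=True requires a non-empty proxy pool")
        proxy = proxies[0]
        proxies.rotate(-1)  # advance the pool for the next request
        kwargs["proxies"] = {"http": proxy, "https": proxy}
    return kwargs
```

The scraper could then call `requests.get(url, **build_request_kwargs(use_proxy=flag, proxies=pool))`, so that with the flag left at its default the request goes out directly.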

neokd commented 1 year ago

@Bchass looks good. I'll review it fully and merge it. What do you think about publishing the scraper as a package on PyPI?

Bchass commented 11 months ago

Is this project being maintained anymore?

neokd commented 11 months ago

@Bchass yeah, the project is maintained. There was a merge conflict, so I didn't merge this PR.