dmmarmol / pesapi

Playground project for developing a basic API for web scraping using Node.js with async/await
9 stars, 2 forks

get player #19

Open elinardo10 opened 4 months ago

elinardo10 commented 4 months ago

Hi. I can't get all the players; it only gets the first player of the table from the site. What can I do? Can you help me? Thanks.

dmmarmol commented 4 months ago

Hey, thank you very much for your interest in this project!

As stated in the main README.md file, this was a proof of concept that I left unfinished at the time. However, I'm glad to offer some insights if you want to give it a try, or maybe submit a PR in the future.

Thanks to your comment, I've recently pushed an update to the package.json dependencies and tried running the main script again.

I suggest pulling the latest changes from master and then installing all the dependencies with yarn. Lastly, run yarn start once again.

The crawler script is not optimized: in order to write its JSON output, it first needs to finish crawling ALL the players on the pesdb.net website. This is definitely something to improve, by writing to a JSON file (or a database, maybe) right after each player is crawled.
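To illustrate the "write right after each player crawl" idea, here is a minimal sketch that appends one JSON line per player instead of writing a single array at the end. The names `appendPlayer` and `readPlayers` are hypothetical and not part of pesapi, and the JSON Lines layout (one object per line) is an assumption swapped in for a single JSON array, since an array can't be appended to without rewriting the file.

```javascript
const fs = require("fs").promises;

// Hypothetical helper (not part of pesapi): append one player as a single
// JSON line, so whatever was crawled so far is already on disk if the run
// stops halfway.
async function appendPlayer(filePath, player) {
  await fs.appendFile(filePath, JSON.stringify(player) + "\n", "utf8");
}

// Hypothetical helper: read the JSON Lines file back into an array of
// player objects, skipping blank lines.
async function readPlayers(filePath) {
  const text = await fs.readFile(filePath, "utf8");
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}
```

With something like this, the crawler could call `await appendPlayer("players.jsonl", player)` as soon as each player is parsed, and `readPlayers` would rebuild the full array afterwards.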

elinardo10 commented 4 months ago

This project is very cool and useful. I pulled the updates here, but I still don't understand how you navigate the site if you don't use puppeteer. I've never actually used it either, but I've been studying it, and maybe I'll adapt what you already have to use it. After your update, things seem to have become a little more complicated, because the script no longer creates the array it used to create. Before, it created an array of players with only the first one included. Now it just reads the players but doesn't capture them. I'm interested in this pesdb database because I want to start developing an app to organize digital football championships (PES and FIFA). I'm still looking for a FIFA database, but you already have the PES one.

I also noticed that it no longer builds the array of players. It now builds an array with each player's URL, without the details that were previously very cool: the array of objects, where each object had the player's details, name, etc. I'll try to study the code to see if I can help with anything.

dmmarmol commented 4 months ago

Sorry for any inconvenience I may have caused. Your motivations are the same ones I had back then when I attempted this demo.

A couple of years ago I reached out to the owner of pesdb through Facebook to ask if he/she had a public API to share with the world, to avoid having to scrape the whole website. He/she did have one but wasn't open to sharing it, so crawling was the last available resource.

As far as I remember, the idea behind this project was to crawl every player from every page at pesdb.net and then save each player inside a JSON file. The mid-term goal was to have a relational or non-relational db behind it, where we could save every player directly, right after the crawler has the info.
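The "every player from every page" loop could be sketched roughly as below. This is only an illustration under assumptions, not the project's actual code: `crawlAllPlayers` and its three callbacks (`fetchPlayerUrls`, `fetchPlayer`, `onPlayer`) are hypothetical names, injected so the loop doesn't depend on how pesdb.net is really fetched or parsed, and an empty page is assumed to mean the listing has ended.

```javascript
// All three callbacks are hypothetical stand-ins, injected so the loop can be
// read (and tried out) without touching pesdb.net:
//   fetchPlayerUrls(page) -> Promise<string[]>  (empty array = no more pages)
//   fetchPlayer(url)      -> Promise<object>    (the parsed player details)
//   onPlayer(player)      -> Promise<void>      (e.g. save to a file or db)
async function crawlAllPlayers({
  fetchPlayerUrls,
  fetchPlayer,
  onPlayer,
  maxPages = Infinity,
}) {
  let count = 0;
  for (let page = 1; page <= maxPages; page++) {
    const urls = await fetchPlayerUrls(page);
    if (urls.length === 0) break; // assumed end-of-listing signal

    // Visit every row of the page, not only the first one, and turn each
    // URL into a full player object before handing it over.
    for (const url of urls) {
      const player = await fetchPlayer(url);
      await onPlayer(player);
      count++;
    }
  }
  return count; // number of players handed to onPlayer
}
```

The requests are awaited one at a time on purpose, which is slower than running them in parallel but gentler on the site being crawled. `onPlayer` is where a per-player save (file or db) would plug in.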

I haven't used puppeteer before either, but it'd be a great addition to this project if you want to fork it and give it a try!

I'm really open to any kind of collaboration to make this work.

elinardo10 commented 4 months ago

Thanks. I'll try it and send something if I succeed. Your script is practically ready; there are just a few details missing. With puppeteer I saw that it is much easier.