mihanovak1024 / programmers-guide-slovenia

Basic guide for new programmers in Slovenia 🇸🇮

Create Facebook-post parser #13

Closed mihanovak1024 closed 3 years ago

mihanovak1024 commented 4 years ago

Create a parser for Facebook posts or specific comments into a format for non-Facebook users.

Add both the direct Facebook link to a post and the parsed format path/HTML/...

matej2 commented 4 years ago

You might want to check out this project I am working on.

mihanovak1024 commented 4 years ago

@matej2 I'm already writing my own parser in Python for fun, since I have never programmed in Python before and it looks like a good challenge. If the final product doesn't turn out as I expected, I'll just look into yours. But for now, I'd rather have my own. The progress can be seen on the branch issue13/facebook-post-parser. Since I have some other stuff to do right now, development is going to take a bit longer.

matej2 commented 4 years ago

I see you are making a browser scraper. This was the initial idea for my project. After a few tries I found out that there is a high chance that Facebook will block the account (even if it is confirmed with a phone number). In that case you will have to fake the headers, the user-agent string, and other parts of a real request.

matej2 commented 4 years ago

@mihanovak1024 any updates on the scraper?

mihanovak1024 commented 4 years ago

@matej2 I forgot about this one. Will start working on it in the upcoming days.

matej2 commented 4 years ago

Some tips for this: Ask the group admin to switch the group to public for a short time while you scrape. This way you can use a proxy to scrape. Your best option is to use the old Fb (which does not use JS). If this still gets you blocked, run the browser with headless=false.

mihanovak1024 commented 3 years ago

https://github.com/mihanovak1024/fejstbukov-parser this is one way of parsing content for non-fb user accounts...

TODO:

matej2 commented 3 years ago

You could use the parser to convert posts to Markdown and then paste them into the README. I will see if I can help anywhere.

Edit: Oh, I didn't see that this is already in the TODO.
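
For illustration, the post-to-Markdown step could look roughly like the sketch below. It is not taken from either parser in this thread: the `ParsedPost` fields, the `post_to_markdown` helper, and the example URL are all assumptions about what a parsed post might contain, so the real output format may well differ.

```python
from dataclasses import dataclass


@dataclass
class ParsedPost:
    """Hypothetical shape of one parsed post; the real parser's output may differ."""
    author: str
    date: str
    url: str   # direct link back to the original post
    text: str  # plain-text body of the post


def post_to_markdown(post: ParsedPost) -> str:
    """Render one post as a Markdown section with a link back to the original."""
    # Prefix every line of the body with ">" so it renders as a blockquote;
    # empty lines get a bare ">" to keep the quote in one piece.
    quoted = "\n".join(f"> {line}" if line else ">" for line in post.text.splitlines())
    return (
        f"### {post.author} ({post.date})\n\n"
        f"{quoted}\n\n"
        f"[Original post]({post.url})\n"
    )


# Made-up example data, only to show the call.
example = ParsedPost(
    author="Example Author",
    date="2020-05-01",
    url="https://example.com/post/1",
    text="First line.\n\nSecond line.",
)
markdown = post_to_markdown(example)
print(markdown)
```

Keeping the direct link next to the quoted text would cover both things the issue asks for: the original post for Facebook users and a readable copy for everyone else.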

mihanovak1024 commented 3 years ago

Closing as it violates Facebook policy