ManiMozaffar / linkedIn-scraper

A Playwright bot that scrapes LinkedIn and stores advertisement data in a database and a Telegram channel

ChatGPT exported AD vs Real AD on linkedin #6

Closed saharsdr closed 1 year ago

saharsdr commented 1 year ago

Hello, I saw an ad in the Telegram bot whose requirements and text were about certain technologies, but none of them appeared in the real LinkedIn ad.

Translated from Persian: The content of the ad posted on Telegram bore no resemblance to the content of the original ad on LinkedIn. For example, the LinkedIn ad was for a user experience researcher, but the channel post said the role requires 3 years of Python experience and one of the frameworks, requires one of the databases, and offers visa sponsorship. None of this was stated in the original ad.

ManiMozaffar commented 1 year ago

Hi, thank you for reporting this. I will investigate and work on fixing it as soon as possible.

ManiMozaffar commented 1 year ago

@saharsdr

Hi Sahar, I think the bug you mentioned should be fixed by the last commit, which I just pushed. Please give me feedback after some time.

ManiMozaffar commented 1 year ago

I can see that the issue still persists in 10-15% of ads; ChatGPT is too dumb :(( If anyone has an idea for solving it, please feel free to open a PR.

ManiMozaffar commented 1 year ago

I'll have to monitor the outcome; hopefully it'll fix the issue. If anyone wants to open a PR to make it better, don't hesitate.

ManiMozaffar commented 1 year ago

After some more workarounds and some benchmarking against my own database, it turns out that the results are improving. Note that this is an AI-controlled system, not a fine-tuned model trained for this purpose, so I believe what I'm doing right now is the best free solution. I'll keep improving the prompts to get better results, but there is always a chance of bad analysis from ChatGPT, which now happens in 1-3% of ads. If we get donations or sponsorship, I can train my own model using fine-tuning in OpenAI, but for now that's not an option.
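For anyone considering a PR, one possible guard against the mismatch reported above is to check each keyword ChatGPT returns against the scraped ad text before posting. The sketch below is only an illustration under assumptions: `grounded_keywords` and its arguments are hypothetical names, not part of this repository, and it assumes the extracted requirements arrive as a plain list of strings.

```python
import re


def grounded_keywords(ad_text: str, extracted: list[str]) -> tuple[list[str], list[str]]:
    """Split extracted keywords into (found in ad text, not found in ad text).

    Hypothetical helper: a keyword counts as grounded only if it appears
    verbatim (case-insensitively) in the original ad text.
    """
    normalized = ad_text.casefold()
    kept: list[str] = []
    dropped: list[str] = []
    for keyword in extracted:
        # Require non-alphanumeric boundaries so "Go" does not match "Google".
        pattern = r"(?<![a-z0-9])" + re.escape(keyword.casefold()) + r"(?![a-z0-9])"
        if re.search(pattern, normalized):
            kept.append(keyword)
        else:
            dropped.append(keyword)
    return kept, dropped


if __name__ == "__main__":
    ad = "We are hiring a User Experience Researcher to run usability studies."
    kept, dropped = grounded_keywords(ad, ["Python", "Django", "usability studies"])
    print("kept:", kept)
    print("dropped:", dropped)
```

A check like this only catches keywords that never appear in the ad; it cannot tell whether a keyword that does appear was interpreted correctly, so it would complement prompt improvements rather than replace them.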