digitalmethodsinitiative / zeeschuimer

A browser extension to collect social media data with.

LinkedIn search does not capture posts #36

Open diana-tircomnicu opened 5 months ago

diana-tircomnicu commented 5 months ago

When attempting to capture from LinkedIn, I can do so for posts on my feed. However, when I search a hashtag and select posts, there is no file created containing the posts identified in LinkedIn search. Can this be fixed?

dale-wahl commented 5 months ago

Which version of Zeeschuimer are you using? The latest is 1.10.1, which should be visible on the main tool page.

Zeeschuimer captures LinkedIn posts (not groups, jobs, people, etc.). I just tested and it is collecting when I search for "artificial intelligence". It first collects the few posts on the search page, then collects the rest when I click "see all post results". Hashtags work similarly.

diana-tircomnicu commented 5 months ago

Hi Dale. I have reinstalled the plug-in and retried, but I get the same result. It can see posts on the main feed and collects the info while scrolling, but search results are not collected. Would it help if I sent you any screenshots?

dale-wahl commented 5 months ago

It could not hurt! Point out which posts are missed. Also, it should not recapture a post that is already in the dataset, if that happens to be the case (e.g., you already collected the post elsewhere).
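One way to check whether a "missed" post is in fact already in the dataset is to look for its ID in the exported NDJSON file. Below is a minimal Python sketch of that check. It assumes each line of the export is a JSON object with an `item_id` field; that field name is an assumption about the export layout, so adjust `id_key` to match your file. The function name and the sample data are hypothetical.

```python
import json


def collected_ids(ndjson_lines, id_key="item_id"):
    """Return the set of IDs found in an iterable of NDJSON lines.

    id_key is an assumed field name; change it to match your export.
    """
    ids = set()
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if isinstance(record, dict) and record.get(id_key) is not None:
            ids.add(str(record[id_key]))
    return ids


# Hypothetical sample standing in for an exported file.
sample = [
    '{"item_id": "111", "data": {}}',
    '{"item_id": "222", "data": {}}',
    "",
]
already = collected_ids(sample)
print("111" in already)  # True: this post would not be captured again
print("333" in already)  # False: this post is not in the dataset yet
```

With a real export, pass an open file handle instead of the sample list, for example `collected_ids(open("export.ndjson", encoding="utf-8"))`, where the file name is a placeholder.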

diana-tircomnicu commented 5 months ago

[Image: Zeeschuimer]

Hi Dale. I have added an image with screenshots explaining the process. I capture the feed, then I search for a hashtag. When I look through the search results, it does not record the few posts displayed. When I go into the post results only, it again does not record anything. I have also tried starting the capture from the post search results only, with the same outcome: 0 posts were captured.

dale-wahl commented 5 months ago

Thank you for the detailed response. Frustratingly, I can follow those steps and collect posts. My best guess at the moment is that LinkedIn has changed their site but the change has not been rolled out everywhere (the only visual difference I see is that my "On this page" is ordered differently). I will try a VPN to see if I can reproduce your issue and get back to you.

dale-wahl commented 5 months ago

OK, I think I can recreate it, but only on occasion. In the meantime, try refreshing the page after you have searched; that seemed to collect for me. Will update you when we find a fix.

dale-wahl commented 5 months ago

OK, well, I thought I was missing posts, but it turns out that LinkedIn will "lazy load" posts that it has already shown in your current session. So when I searched again after deleting the collected posts, they were not collected (because they were already loaded). You can force a reload via Ctrl+F5 or Shift+Command+R, and then they are collected for me. I am not sure if that is your issue or if you are encountering a different version of their site that I do not have access to for troubleshooting.

diana-tircomnicu commented 5 months ago

I do not think this is my problem unfortunately. I have tried various methods for capturing the posts after search and it does not work. I'll keep an eye on here to see if it ends up being solved. Thanks for your support.

cringefauna commented 5 months ago

I have the same problem when scrolling posts on specific profiles. I'm located in the EU, if that is of any value to know. I have tried hard reloading, opening directly from the hyperlink in the extension site... If I try a VPN, do you have any suggestions on where to set it, so as not to run into problems with potential changes to LinkedIn?

dale-wahl commented 5 months ago

It was working in Amsterdam as of yesterday (and via VPN to the UK). They are likely doing some A/B testing of different versions, but that's just a guess. We will keep checking and see if we cannot identify the issue.

cringefauna commented 5 months ago

Thank you so much. Ehe, it is frustrating to know that it works for you; it makes me doubt whether I am doing it correctly. But thank you, hoping for an update at some point :)

dale-wahl commented 2 months ago

Updates to LinkedIn search were pushed in the latest version. Did this address your issues?

luuislanda commented 1 month ago

Updates to LinkedIn search were pushed in the latest version. Did this address your issues?

Not OP, but Zeeschuimer is now able to collect the posts and the ndjson file looks fine. However, whenever it is sent to 4CAT it gets stuck. The 4CAT logs say:

INFO at processor.py:159: Running processor linkedin-search on dataset ebe24e79fe3d50de7b8b83aef104ad60
INFO at search.py:66: Querying: {'datasource': 'linkedin', 'file': '/usr/src/app/data/linkedin-dataset-ebe24e79fe3d50de7b8b83aef104ad60.importing'}
ERROR at search_linkedin.py:82: Processor linkedin-search raised TypeError while processing dataset ebe24e79fe3d50de7b8b83aef104ad60 in search.py:83->search.py:349->search.py:184->processor.py:754->search_linkedin.py:82: 'NoneType' object is not subscriptable
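For readers unfamiliar with this error: `'NoneType' object is not subscriptable` is what Python raises when code indexes into a value that turned out to be `None`, for example a nested field that is present but null in one of the collected items. The sketch below reproduces the error on made-up data and shows one defensive way to read nested fields. It is not 4CAT's actual code; the field names (`author`, `name`) and the `dig` helper are hypothetical and only illustrate the failure mode.

```python
def dig(obj, *keys, default=None):
    """Walk nested dicts/lists; return default if any step is missing or None."""
    for key in keys:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return default
        if obj is None:
            return default
    return obj


# Made-up items: one complete, one where a nested object is null.
item_ok = {"author": {"name": "Example"}}
item_null = {"author": None}

# Direct indexing fails on the null field with the error seen in the log.
error_message = None
try:
    item_null["author"]["name"]
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# The defensive lookup returns a fallback instead of raising.
print(dig(item_ok, "author", "name"))
print(dig(item_null, "author", "name", default=""))
```

A lookup like this trades a crash for a silent fallback value, so it is best paired with logging of which items hit the fallback.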

luuislanda commented 1 month ago

* 4CAT Version: 1.45

* Zeeschuimer: 1.10.4

Just want to quickly jump in to say that updating to Zeeschuimer 1.11.0 and 4CAT 1.46 has solved this issue.

Thank you!