fabiogiglietto / CooRnet

Given a set of URLs, this package detects coordinated link sharing behavior on social media and outputs the network of entities that performed such behavior.
MIT License

get_ctshares() issue #34

Closed aqibufu closed 1 year ago

aqibufu commented 1 year ago

Hi, contributors of CooRnet, thanks for your excellent work on the CooRnet package. I seem to have run into a bug or problem. When I use the function get_ctshares() to collect the sharing data, the process sometimes gets stuck with no warning or error. I have hit this problem three times. The first time, I stopped the process and tried again. The second time, I stopped the process and R warned that "R is not responding to your request to interrupt processing so to stop the current operation you may need to terminate R entirely". I selected No, and the process resumed and continued collecting the sharing data. But this time I hit the problem again, and doing the same thing as the second time no longer works. I don't know why I am seeing this issue; maybe it's an API problem? But I have done another analysis with CooRnet before, and the API worked normally. Could you help me figure this out? I can simply stop the process and try again, but that wastes a lot of time. Thanks!
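For context, a minimal sketch of the kind of call involved, with argument names as shown in the CooRnet README at the time of writing (adjust to your installed version; the values are only placeholders):

```r
library(CooRnet)

# urls: a data frame with one row per link, containing the columns
# referenced by url_column and date_column
ct_shares.df <- get_ctshares(
  urls,
  url_column  = "url",
  date_column = "date",
  platforms   = "facebook,instagram",
  nmax        = 100,   # max number of posts returned per URL query
  sleep_time  = 30,    # pause between CrowdTangle API calls, in seconds
  clean_urls  = TRUE   # normalize URLs before querying
)
```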

fabiogiglietto commented 1 year ago

Hi :) the issue you describe may be related to a lack of resources on your machine. Unfortunately R doesn't handle these cases very well and just hangs. Could this be the case? How many URLs are you starting from?

aqibufu commented 1 year ago


Hi, Fabio. Since our dataset is relatively large, I divided it into several parts. The first part contains the links from 60,000 Facebook posts. I had actually tried another dataset before, with links from about 120,000 Facebook posts, and get_ctshares() collected the URL sharing data for that one successfully. For now, the best approach I can think of is to split the dataset into several smaller datasets and hope that this problem does not occur on the small ones. Thanks.
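In case it helps other readers, a minimal sketch of this batching approach could look like the following (get_ctshares() arguments as in the README; chunk size and file names are just placeholders):

```r
# Split the URL data frame into smaller batches, run get_ctshares() on
# each batch, and save every batch to disk so a hang or crash does not
# lose the work already done.
chunk_size <- 10000
chunks <- split(urls, ceiling(seq_len(nrow(urls)) / chunk_size))

ct_shares_list <- list()
for (i in seq_along(chunks)) {
  ct_shares_list[[i]] <- get_ctshares(
    chunks[[i]],
    url_column  = "url",
    date_column = "date"
  )
  saveRDS(ct_shares_list[[i]], sprintf("ct_shares_chunk_%02d.rds", i))
}

# Combine the batches once they are all collected
ct_shares.df <- do.call(rbind, ct_shares_list)
```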

fabiogiglietto commented 1 year ago

It really boils down to the number of unique URLs shared by the posts you are starting from. A dataset of 60k Facebook posts may include a very different number of unique URLs depending on the type of posts collected. Please also keep in mind that, due to a limitation of CrowdTangle's API link endpoint, some URLs (e.g. Telegram bots) are incorrectly interpreted as a domain search. As a result, you may get back a large number of shares (CooRnet retrieves up to 10k shares for each link) that are totally unrelated to your original URL.
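As a side note, a quick way to check how many unique URLs a batch actually expands to, and to drop link patterns that the endpoint may misread as a domain search, might look like this (the column name and the Telegram pattern are only illustrative assumptions):

```r
# The number of API calls depends on unique URLs, not on the number of posts
length(unique(urls$url))

# Optionally drop URL patterns that the CrowdTangle link endpoint may
# treat as a domain search (e.g. Telegram bot links), which can return
# thousands of unrelated shares
urls <- urls[!grepl("^https?://t\\.me/", urls$url), ]
```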

aqibufu commented 1 year ago


Yes, I know the URLs are actually the unique URLs shared by the posts I'm starting from. :) That reminder also surprised me. I have also run into the error "unexpected http response code 503 on call". Could the 503 be the limit you mentioned? For now, I have split the whole posts dataset into several smaller datasets and asked for the rate limit of my token to be increased.

fabiogiglietto commented 1 year ago

Nope, 503 is an HTTP server error on the CrowdTangle API side.
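Since 503s are transient server-side errors, a generic retry wrapper (not part of CooRnet, just an illustration) could re-run a batch after a pause instead of losing it:

```r
# Retry a batch when get_ctshares() fails, e.g. on a transient 5xx such as 503
get_ctshares_with_retry <- function(urls, retries = 3, wait = 60, ...) {
  for (attempt in seq_len(retries)) {
    result <- tryCatch(get_ctshares(urls, ...), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    Sys.sleep(wait)  # wait before retrying the whole batch
  }
  stop("All retries failed")
}
```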

aqibufu commented 1 year ago


Oh, I see. Thanks!