MaelKubli / RTwitterV2

R functions for Twitter's v2 API
MIT License

Issue with parsing functions while using full_archive_search function #12

Closed ichbinkapil closed 1 year ago

ichbinkapil commented 1 year ago

Hi,

When using the full_archive_search function I get the same error again that you fixed a few weeks ago. I think the problem is again with the parsing functions. My code is as follows:

query <- ""

# timeframe

days <- seq(as.Date("2021-06-01"),as.Date("2021-07-31"), by = "day")

# collect tweets around the vaccination debate

df <- NULL

for(i in 2:length(days)){
  lower <- paste0(days[i-1], "T00:00:01Z")
  upper <- paste0(days[i], "T00:00:01Z")

  tmp <- full_archive_search(token = Bearer_Token, search_query = query,
                             start_time = lower, end_time = upper,
                             tweet_fields = "ALL", user_fields = "ALL",
                             n = 500, n_try = 10)

  if(is.null(df)){
    df <- tmp
    setwd(parent_path)
  } else {
    df <- dplyr::bind_rows(df, tmp)
    setwd(parent_path)
  }
  cat(paste0(lower, " to ", upper, " has been collected!\n"))
}

Error I got is: Error in vecseq(f, len, if (allow.cartesian || notjoin || !anyDuplicated(f__, : Join results in 1643 rows; more than 209 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice.
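For context (an explanatory note, not part of the thread): this error comes from data.table, not from the Twitter API. It is raised when a join would return more rows than nrow(x) + nrow(i), which happens when the join key contains duplicate values on both sides. A minimal sketch that reproduces the message, independent of RTwitterV2:

```r
library(data.table)

# Both tables have duplicate values in the join key "id":
x <- data.table(id = c(1, 1, 2), a = 1:3)
i <- data.table(id = rep(1, 4),  b = 1:4)

# Each of the 4 rows in i matches the 2 rows in x with id == 1,
# so the join would return 8 rows, more than nrow(x) + nrow(i) = 7.
# data.table refuses and raises the "Join results in ... rows" error:
msg <- tryCatch(x[i, on = "id"], error = function(e) conditionMessage(e))

# Opting in explicitly allows the cartesian expansion:
ok <- x[i, on = "id", allow.cartesian = TRUE]
nrow(ok)  # 8
```

In this issue the duplicate keys arise inside the package's parsing function, so the fix belongs there rather than in the calling code.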

The function works well, loading the data for about 10 days, but after that this error appears.

The package version is '0.2.6.4', which I believe is the latest.

Thank you in advance for your help and efforts.

Best

MaelKubli commented 1 year ago

Hi

I will have a look at it, but one question: is your query really an empty string? That should not work, if I am not mistaken.

ichbinkapil commented 1 year ago

Hi

Thank you for the quick reply. As my query consists of more than 15 keywords, I have not posted it here. An example would be:

# query

query <- "Impfung OR Impfstoff OR Impfbereitschaft OR Impfgegner" and so on.

Best

MaelKubli commented 1 year ago

Thank you for the example. I will try to find a fix for it. One question: do you by any chance know on what date this error occurs? That would make it easier for me to find an instance that triggers the error and work out a suitable solution.

All the best, Maël

ichbinkapil commented 1 year ago

I tried the day before yesterday and got the error. I tried again yesterday but got the same error.

Thank you for your efforts.

Best

MaelKubli commented 1 year ago

Hi

I just uploaded a new version! I really hope it fixes the issue...

Please try it out and tell me if you still get the error. If so, I will have to dig deeper into it.

ichbinkapil commented 1 year ago

Hi

After updating to the new version '0.2.6.5' I tried again, but unfortunately I got the same error.

Best

MaelKubli commented 1 year ago

Well, in this case it would really help a lot to have the exact query (including the date range), so that I can reproduce the error on my end and start investigating where the problem occurs.

Otherwise it is really difficult to figure out exactly what leads to the error, since it indicates a merge error in the parsing function.

ichbinkapil commented 1 year ago

Thanks again for your efforts. My exact code is as follows:

query <- "Impfung OR Impfstoff OR impfen lassen OR Impfempfehlung OR Impfgegner OR Impfquote OR Impfbereitschaft OR Impfgegnerschaft OR Impfskeptiker OR Impfberatung OR Impfgeschädigte OR Impfschäden OR Impfwirkung OR Impfnebenwirkungen OR Impfreaktion OR Impftermin OR Impfmüdigkeit OR Impfnachweis OR Impfbescheinigung OR Impfverweigerer OR Impfverordnung OR Impfzertifikat OR Impfbefreiung lang:de"

days <- seq(as.Date("2021-07-01"),as.Date("2021-07-31"), by = "day")

df <- NULL

for(i in 2:length(days)){
  lower <- paste0(days[i-1], "T00:00:01Z")
  upper <- paste0(days[i], "T00:00:01Z")

  tmp <- full_archive_search(token = Bearer_Token, search_query = query,
                             start_time = lower, end_time = upper,
                             tweet_fields = "ALL", user_fields = "ALL",
                             n = 5000, n_try = 10)

  if(is.null(df)){
    df <- tmp
    setwd(parent_path)
  } else {
    df <- dplyr::bind_rows(df, tmp)
    setwd(parent_path)
  }
  cat(paste0(lower, " to ", upper, " has been collected!\n"))
}
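One aside on the loop above (an observation about the code, not something raised in the thread): the Twitter v2 API treats start_time as inclusive and end_time as exclusive, so the daily windows as written should not overlap, but tweets posted during the first second of each day (00:00:00 to 00:00:01 UTC) are never requested. If duplicate tweets across batches are still a concern, a defensive dedupe before saving is cheap; the column name tweet_id here is an assumption about the data frame returned by full_archive_search and may need adjusting:

```r
# Hypothetical: drop duplicate tweets by their id before saving.
# "tweet_id" is an assumed column name; check the actual output of
# full_archive_search and adjust accordingly.
df <- dplyr::distinct(df, tweet_id, .keep_all = TRUE)
```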

readr::write_csv(df, "test_df2.csv")

Best

MaelKubli commented 1 year ago

Ok

I was able to reproduce the error and I found a fix, which I uploaded :)

If it works at your end I will close the issue.

ichbinkapil commented 1 year ago

It works now :)

Thank you for your efforts and consideration.

Best