Closed: tsoukinator closed this issue 4 years ago.
I have the same problem. Would be very grateful for a fix!!
It's a bit of a cat and mouse game with ngram - Google doesn't provide an API so it's all about scraping and keeping up with their obfuscation changes. I have to admit it's been a long time since I wrote this package so it might take me a while to get back up to speed and work out what's broken! I'll see what I can do - but no promises...
I have pushed a new version to GitHub (not yet to CRAN). Let me know if the problem has been resolved.
Thanks for getting back to this. It appears there's another issue now, though.
Error in pivot_longer(df, -Year, names_to = "Phrase", values_to = "Frequency") : could not find function "pivot_longer"
4. | ngram_single(phrases, corpus = corp, year_start = year_start, year_end = year_end, smoothing = smoothing, tag = tag, case_ins)
3. FUN(X[[i]], ...)
2. | lapply(corpus, function(corp) ngram_single(phrases, corpus = corp, year_start = year_start, year_end = year_end, smoothing = smoothing, tag = tag, case_ins))
1. | ngramr::ngram("dog")
Hi - the package now requires tidyr, so try installing that package first and see if that helps. Sean.
I've also changed the way the tidyr function is called so you may need to reinstall ngramr too. Sean.
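As an aside, for anyone else hitting the pivot_longer error: the change presumably amounts to the difference between a bare call and a namespace-qualified one. A minimal sketch with a made-up data frame (illustrative only, not the actual ngramr source):
# Toy data frame in the shape the error message suggests (one column per phrase)
df <- data.frame(Year = 2000:2002,
                 dog  = c(1.0e-5, 1.1e-5, 1.2e-5),
                 cat  = c(2.0e-5, 2.1e-5, 2.2e-5))
# Bare call: fails with "could not find function" unless the user has run library(tidyr)
# pivot_longer(df, -Year, names_to = "Phrase", values_to = "Frequency")
# Namespace-qualified call: works whenever tidyr is installed, attached or not
tidyr::pivot_longer(df, -Year, names_to = "Phrase", values_to = "Frequency")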
Thanks Sean - explicitly loading library(tidyr) after library(ngramr) made it work!
library(devtools)
install_github("seancarmody/ngramr")
library(ngramr)
library(tidyr)
dog <- ngramr::ngrami("dog")
dog
There may be a place in your package where you can declare tidyr as a dependency so that it loads automatically for your users. In my package rhymebrainR, I have a devstuffs.R file, which declares my package's dependencies as follows. Declaring tidyr in this fashion might allow it to load along with ngramr.
# Get the dependencies
use_package("httr")
use_package("jsonlite")
use_package("curl")
use_package("attempt")
use_package("purrr")
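For ngramr the equivalent would presumably be a one-off call from the package source directory (assuming the usethis/devtools workflow; use_package() adds the package to the Imports field of DESCRIPTION, so it gets installed alongside ngramr, although its functions still need to be called with tidyr:: or imported in NAMESPACE):
# Illustrative only -- run once from the ngramr source tree
usethis::use_package("tidyr")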
One last note: on your package description page, the command you give to install/load the package from GitHub doesn't seem to work with my version of R (perhaps it was an older method?).
To install the package from your GitHub account, I had to use the following code:
library(devtools)
install_github("seancarmody/ngramr")
Lastly Sean - thanks for providing this fix - assuming you're in the States, it should technically still be my birthday - and it is a sweet gift at that! I'm looking forward to using this package in one of my hobby projects! Thanks so much for fixing it! ~ Final edit, noticed you're another Aussie! In that case, cheers!
Thanks for the feedback! I thought I'd updated the dependency on tidyr but my R package skills are a bit rusty so I did it in the wrong place. I'm hoping it works now without the explicit call to library(tidyr). If there are still problems, do let me know.
On the install_github point, you're right: the syntax for the function has changed, so I've updated the README.
Enjoy and a belated Happy Birthday.
Thanks again Sean! This package truly makes my life easier, and I'm really looking forward to using it!
Hey Sean,
I think something else might have broken; I'm now getting this issue.
dog <- ngramr::ngrami("dog")
Error in ddply(result, c("Year", "Corpus", "Phrase"), summarise, Frequency = sum(Frequency)) : could not find function "ddply"
Sorry about that! It's the result of a poorly executed switch from the (old) ddply to dplyr. Should be fixed now.
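For reference, the plyr call in the error message maps onto dplyr roughly like this (a sketch with a toy data frame; the package's actual code may differ):
library(dplyr)
# Toy stand-in for the intermediate `result` data frame
result <- data.frame(Year = c(2000, 2000, 2001),
                     Corpus = "eng_2012",
                     Phrase = "dog",
                     Frequency = c(1.0e-5, 2.0e-6, 1.1e-5))
# Old plyr version (the call in the error message):
#   ddply(result, c("Year", "Corpus", "Phrase"), summarise, Frequency = sum(Frequency))
# dplyr equivalent:
result <- result %>%
  group_by(Year, Corpus, Phrase) %>%
  summarise(Frequency = sum(Frequency)) %>%
  ungroup()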
Thanks again Sean! Fixing the world one package at a time!
Note: I've redeveloped a lot of the underlying code which will (hopefully) make it more robust and provide better warnings. If you have a chance, have a look at the development version and let me know if it causes any problems (note if you use the "tag" argument in ngram that won't work anymore - I've removed it). When I'm confident it's stable I plan to replace the master version and republish on CRAN.
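(For anyone trying the development version: with the current devtools syntax it can be installed by pointing install_github at the relevant branch. The branch name below is a placeholder; check the repository for the actual one.)
# "develop" is a hypothetical branch name -- substitute whatever the
# development branch is actually called on GitHub
devtools::install_github("seancarmody/ngramr", ref = "develop")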
Hi Sean,
thanks for doing all this work, I really appreciate it!
I've been away on holiday for two weeks. I did use the new code while away to download just over 1000 records from Google (I still have about 13000 to go, which I may tackle using alternative methods). Apart from having to work around the usual server restrictions after a certain number of records has been processed (I can usually get around 80 records in one go), the code is very reliable. I had no problems obtaining the data. I'm still processing the results and will compare them with the underlying data, which I had also downloaded but couldn't process due to the memory restrictions of my laptop.
I don't think I have needed the tag argument. I have long lists of tokens, pass these to your function, then collect the output and store it for further work. We will cite your code in the paper this is going to be used for. Let me know if you have a preferred version for the citation.
Here is the main segment of the "code" I use at the moment (where etc and XXX are replaced by appropriate content):
tokens <- list("asynchronously", "barware", "bender", etc... )
tokens_chr <- as.character(tokens)
#######
for (i in 1:XXX) {
  if (i == 1) {
    ng <- ngram(tokens_chr[i], year_start = 1960)
  } else {
    ngh <- ngram(tokens_chr[i], year_start = 1960)
    ng <- rbind(ng, ngh)
  }
  Sys.sleep(2.34)
}
tail(ng)
write.dta(ng, "ngrams_dwnld.dta")  # write.dta() comes from the foreign package
rm(ng, ngh)
Best,
G
I'm glad it's working for you. I've just pushed an update to the development version with a new function, chunk, that could be helpful to speed up your process a bit. Here's how I'd use it:
tokens <- c("asynchronously","barware","bender", etc... )
ng <- bind_rows(lapply(chunk(tokens, 12), function(c, ...) {Sys.sleep(2); ngram(c, ...)}, year_start = 1960))
Google's Ngram Viewer allows up to 12 tokens per request, and chunk splits your bigger list into a list of chunks of length 12 to work through.
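(For reference, if the development version's chunk() isn't available, the same splitting can be done in base R along these lines; a sketch assuming at most 12 phrases per request:)
library(ngramr)
library(dplyr)
tokens <- c("asynchronously", "barware", "bender")  # full list goes here
# Split the token vector into groups of at most 12 (the Ngram Viewer limit),
# then query each group with a pause between requests
groups <- split(tokens, ceiling(seq_along(tokens) / 12))
ng <- bind_rows(lapply(groups, function(g) {
  Sys.sleep(2)
  ngram(g, year_start = 1960)
}))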
Just installed the package today, and ran into this error. Is this to do with my installation, or perhaps Google have changed something on their end?
Error in fromJSON(sub(".*=", "", html[data_line])) : CHAR() can only be applied to a 'CHARSXP', not a 'NULL'
Traceback: