gesistsa / grafzahl

🧛 fine-tuning Transformers for text data from within R
https://gesistsa.github.io/grafzahl/
GNU General Public License v3.0

predict.grafzahl() seems to ignore "cuda = FALSE" #19

Closed bachl closed 1 year ago

bachl commented 1 year ago

Hi @chainsawriot First of all, thank you for your work and another great package! After reading the CCR software announcement, I wanted to check out grafzahl. In particular, I wanted to see whether using it on my notebook without a CUDA GPU would make any sense at all. The setup process worked smoothly. I then replicated the Theocharis et al. (2020) example. Model training worked fine (although it needed 11.5 hours, as was to be expected). However, the predict() step did not work. Input:

pred_bert <- predict(object = model, newdata = unciviltweets[test], cuda = FALSE)

Error:

"Error in py_call_impl(callable, dots$args, dots$keywords): ValueError: 'use_cuda' set to True when cuda is unavailable. Make sure CUDA is available or set use_cuda=False"

I share a reproducible example below, with a nonsensical reduction of the training set to make it finish within a sensible timeframe. The error message remains the same.

Thanks again!

pacman::p_load(grafzahl, quanteda, caret, tictoc, tidyverse)

# Preprocessing kept from the original workflow (not needed by grafzahl itself)
uncivildfm <- unciviltweets %>%
  tokens(remove_url = TRUE, remove_numbers = TRUE) %>%
  tokens_wordstem() %>%
  dfm() %>%
  dfm_remove(stopwords("english")) %>%
  dfm_trim(min_docfreq = 2)

y <- docvars(unciviltweets)[, 1]
seed <- 123
set.seed(seed)
training <- original_training <- sample(seq_along(y), floor(.80 * length(y)))
test <- seq_along(y)[!seq_along(y) %in% training]

set.seed(721)
tic()
model <- grafzahl(unciviltweets[original_training[1:20]],
                  model_type = "bertweet",
                  model_name = "vinai/bertweet-base",
                  output_dir = here::here("reprex"))
toc()

pred_bert <- predict(object = model, newdata = unciviltweets[test], cuda = FALSE)
chainsawriot commented 1 year ago

@bachl Thank you for reporting the bug. It should now be fixed.
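For context, the reported failure mode is consistent with a common forwarding bug: the predict method passing a `use_cuda` value stored at training time down to the Python layer instead of the caller's `cuda` argument. A minimal sketch of the corrected pattern (all names here are hypothetical, not grafzahl's actual internals):

```r
# Hypothetical sketch only; grafzahl's real implementation differs.
# The point: predict() must honour the *caller's* `cuda` argument when
# setting simpletransformers' `use_cuda`, not the training-time value.
predict_sketch <- function(object, cuda = FALSE) {
  # Buggy variant would do: use_cuda <- object$use_cuda  (ignores the caller)
  use_cuda <- isTRUE(cuda)  # respect `cuda = FALSE` on CPU-only machines
  use_cuda
}

trained_on_gpu <- list(use_cuda = TRUE)
predict_sketch(trained_on_gpu, cuda = FALSE)  # FALSE: CPU inference is respected
```

This keeps a model trained on a GPU machine usable for prediction on a CPU-only machine, which is exactly the scenario in the report above.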

## standard testing case from Manning et al., Introduction to Information Retrieval
require(grafzahl)
txt <- c(d1 = "Chinese Beijing Chinese",
          d2 = "Chinese Chinese Shanghai",
          d3 = "Chinese",
          d4 = "Tokyo Japan Chinese",
          d5 = "Chinese Chinese Chinese Tokyo Japan")
y <- factor(c("Y", "Y", "Y", "N", "Y"), ordered = TRUE)

model <- grafzahl(x = txt, y = y, train_size = 1, num_train_epochs = 1,
                  model_name = "bert-base-cased", cuda = FALSE)
predict(model, cuda = FALSE)