mjwestgate / revtools

Tools to support research synthesis in R
https://revtools.net

Topic Model: 'arguments imply differing number of rows' #3

Closed: DesiQuintans closed this issue 6 years ago

DesiQuintans commented 6 years ago

This bibliography from ScienceDirect is correctly loaded by read_bibliography() with no errors, but when I run start_review_window() it produces this error message:

Error in data.frame(id = rownames(dtm), label = info$label[which(x_keep)],  : 
  arguments imply differing number of rows: 867, 879

I tried splitting this large file and loading each chunk separately to see whether I could narrow the problem down to a single malformed record, but when I do this, the mismatched row counts reported in the error message change. For example, loading the second half of the linked file might report 299, 308 (9 bad records). But splitting that half in two again can lead to successful execution for the first part, and 125, 127 (2 bad records) for the second part. So it seems that there's some interaction happening here. Wish I had more info for you!
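For anyone landing here from a search: the message itself comes from base R's data.frame(), which refuses to combine columns whose lengths are not compatible. A minimal sketch that reproduces the same kind of error without revtools is below; the vectors are hypothetical stand-ins, not the real objects inside start_review_window().

```r
# Hypothetical stand-ins: 3 documents made it into the model, but 5 labels exist.
ids    <- c("doc1", "doc2", "doc3")
labels <- c("a", "b", "c", "d", "e")

# data.frame() cannot recycle 3 values against 5, so it raises an error.
# tryCatch() captures the message instead of stopping the script.
result <- tryCatch(
  data.frame(id = ids, label = labels),
  error = function(e) conditionMessage(e)
)

print(result)
```

The two numbers in the message are the lengths of the mismatched columns, so they describe how many records were dropped overall rather than pointing at one malformed record.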

mjwestgate commented 6 years ago

This was a silly mistake on my part! The code can only plot articles for which there are sufficient words to analyse. Your dataset was lacking abstracts for some articles, so the number that could be modelled was smaller than the size of the whole dataset. This shouldn't be a problem: the code was supposed to check for such articles and drop them before fitting the model, but instead it fitted the model first and deleted the missing rows afterwards. As a result, the different parts of the data sent to the shiny window had different numbers of rows and couldn't be combined. I've fixed and tested this with your code, but again, let me know if it fails on your computer and I'll check again!
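The "filter first, then model" ordering described above can be sketched in base R roughly as follows. This is an illustration only, not the actual revtools fix: the names info, x_keep and dtm are borrowed from the error message, while the example records, the two-word threshold and the toy document-term matrix are all assumptions.

```r
# Hypothetical records: two of the five have no usable abstract.
info <- data.frame(
  label    = c("A2001", "B2002", "C2003", "D2004", "E2005"),
  abstract = c("soil carbon flux", NA, "bird migration timing", "",
               "fire ecology review"),
  stringsAsFactors = FALSE
)

# Step 1: decide which records have enough words BEFORE any modelling.
text    <- ifelse(is.na(info$abstract), "", info$abstract)
n_words <- lengths(strsplit(trimws(text), "\\s+"))
x_keep  <- n_words >= 2            # assumed threshold, for illustration only

# Step 2: subset once, so every later object is built from the same rows.
info_kept <- info[x_keep, ]

# Step 3: build a toy document-term matrix from the kept records only.
tokens <- strsplit(tolower(info_kept$abstract), "\\s+")
vocab  <- sort(unique(unlist(tokens)))
dtm <- t(vapply(
  tokens,
  function(tok) as.numeric(table(factor(tok, levels = vocab))),
  numeric(length(vocab))
))
rownames(dtm) <- info_kept$label

# Step 4: the pieces now have matching row counts and combine cleanly.
combined <- data.frame(id = rownames(dtm), label = info_kept$label)
print(combined)
```

Because the subset happens once, up front, the matrix and the labels cannot drift apart, whichever records happen to be missing abstracts.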

DesiQuintans commented 6 years ago

Fixed, thanks!