fsolt / ropercenter

Reproducible Data Retrieval from the Roper Center Data Archive
http://fsolt.org/ropercenter/

New Roper Center IDs don't work #7

Status: Closed (fsolt closed 4 years ago)

fsolt commented 4 years ago

https://twitter.com/gronke/status/1280531757368963072

[Screenshot: 2020-07-13, 8:56:24 AM]

cc: @paulgronke, @lin-jennifer

fsolt commented 4 years ago

I'm supposed to hear from Roper's engineering team today, btw (their head was on vacation last week).

lin-jennifer commented 4 years ago

Hi:

Thank you for doing this!

I am having problems even with the historical archive IDs. If I run

roper_download("USPEW2015-GOVERNANCE")

which is a Pew research poll on Roper Center from 2015, I get the following error:

Warning message:
In value[[3L]](cond) :
  Conversion from .por to .RData failed for USPEW2015-GOVERNANCE

In the download directory, I get two empty files, which should be the codebook and the data, but neither can be opened.

Thank you so much and sorry about the constant bother.

fsolt commented 4 years ago

Thanks, Jennifer! No bother; this is good to know. Anyway, the whole package is going to need to be re-written from scratch for the new website. The only question is whether they have an API I'll be able to take advantage of or whether instead I'll have to use RSelenium (which opens up a Chrome window and drives it around; much less elegant).

fsolt commented 4 years ago

I just heard back from Roper that there's currently no API, though they're working on one. So I'll be migrating the pkg to RSelenium as soon as I have a chance--say, within three days? Let's see.

fsolt commented 4 years ago

Okay, the current GitHub version is working for me. Would either of you want to give it a try for me, please, @paulgronke, @lin-jennifer?

lin-jennifer commented 4 years ago

Hi:

Thank you for updating this.

I ran:

roper_download("USPEW2015-GOVERNANCE")

and

roper_download(31114069)

and I get this error both times:

Selenium message:session not created: This version of ChromeDriver only supports Chrome version 84
Build info: version: '4.0.0-alpha-2', revision: 'f148142cf8', time: '2019-07-01T21:30:10'
System info: host: 'Jennifers-MacBook-Pro.local', ip: '2601:586:8301:6690:0:0:0:d1d%en0', os.name: 'Mac OS X', os.arch: 'x86_64', os.version: '10.15.5', java.version: '12.0.1'
Driver info: driver.version: unknown
remote stacktrace: 0   chromedriver                        0x000000010db4ac49 chromedriver + 4893769
1   chromedriver                        0x000000010dae40e3 chromedriver + 4473059
2   chromedriver                        0x000000010d7578fd chromedriver + 751869
3   chromedriver                        0x000000010d6b93b9 chromedriver + 103353
4   chromedriver                        0x000000010d6b5696 chromedriver + 87702
5   chromedriver                        0x000000010d6b29b9 chromedriver + 76217
6   chromedriver                        0x000000010d6e5043 chromedriver + 282691
7   chromedriver                        0x000000010d6e1e43 chromedriver + 269891
8   chromedriver                        0x000000010d6bb62a chromedriver + 112170
9   chromedriver                        0x000000010d6bc635 chromedriver + 116277
10  chromedriver                        0x000000010db0c5af chromedriver + 4638127
11  chromedriver                        0x000000010db1991b chromedriver + 4692251
12  chromedriver                        0x000000010db196bb chromedriver + 4691643
13  chromedriver                        0x000000010daf0109 chromedriver + 4522249
14  chromedriver                        0x000000010db19ea3 chromedriver + 4693667
15  chromedriver                        0x000000010db02073 chromedriver + 4595827
16  chromedriver                        0x000000010db31094 chromedriver + 4788372
17  chromedriver                        0x000000010db50db7 chromedriver + 4918711
18  libsystem_pthread.dylib             0x00007fff6c365109 _pthread_start + 148
19  libsystem_pthread.dylib             0x00007fff6c360b8b thread_start + 15

Could not open chrome browser.
Client error message:
     Summary: SessionNotCreatedException
     Detail: A new session could not be created.
     Further Details: run errorDetails method
Check server log for further details.
Error in checkError(res) : 
  Undefined error in httr call. httr output: length(url) == 1 is not TRUE

Is there something that I am missing?

Thank you!

fsolt commented 4 years ago

Blech. That's why I'd really hoped to avoid using RSelenium; you need to have just the right version of Chrome installed or it fails. I'd sorta hoped that it had gotten better, but naturally no: it has to be the dev version of Chrome.

Soooo, right. More documentation necessary to explain. I'll work on that and get back to you shortly. And thanks!

fsolt commented 4 years ago

Okay, I see that Google has just now (well, on July 14) released v84 as its regular version of Chrome for both Mac and Windows. So updating Chrome at https://www.google.com/chrome/ should do the trick for you? Let me know please. If so, I'll update the docs to let people know. If not, I'll have to figure out what else to do to get it working for you.
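For anyone hitting the same ChromeDriver mismatch, here is a minimal sketch of picking a driver whose major version matches the installed Chrome. The helper `match_chromedriver()` and the version strings are hypothetical illustrations, not part of ropercenter; the commented-out lines assume `binman::list_versions()` and the `chromever` argument of `RSelenium::rsDriver()` behave as documented.

```r
# Hypothetical helper (not part of ropercenter): choose the newest available
# ChromeDriver whose major version equals the installed Chrome's major version.
match_chromedriver <- function(available, chrome_major) {
  # Major version is everything before the first dot, e.g. "84" in "84.0.4147.30"
  majors <- as.integer(sub("\\..*$", "", available))
  hits <- available[majors == chrome_major]
  if (length(hits) == 0) return(NA_character_)
  # numeric_version compares component by component, so .125 sorts after .30
  as.character(max(numeric_version(hits)))
}

# Illustrative version strings only; real ones would come from the local machine
available <- c("83.0.4103.39", "84.0.4147.30", "85.0.4183.38")
driver_version <- match_chromedriver(available, chrome_major = 84)
print(driver_version)

# Assumed usage with a real setup (not run here):
# available <- binman::list_versions("chromedriver")[[1]]
# rD <- RSelenium::rsDriver(browser = "chrome",
#                           chromever = match_chromedriver(available, 84))
```

If no matching driver is available the helper returns `NA`, which is a cue to update Chrome (or the driver) rather than start a session that will fail with `SessionNotCreatedException`.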

lin-jennifer commented 4 years ago

Hi:

Thanks, again, for this.

Generally speaking, it works with the new Chrome update.

I ran the same code that I mentioned above and was able to get files that open.

However, I should note that:

  1. Files with "Historical Archive Numbers" didn't work for me; using the "Archive Number" instead did the trick.

  2. However, for these Historical files, ".por" files did not, for me at least, convert to .RData. That is not a problem for me, per se, but I'm just noting it for reference.

  3. Every time I run roper_download(), I have to shut down the Chrome window that opened on my previous download for it to work; otherwise, I get this error:

    Error in wdman::selenium(port = port, verbose = verbose, version = version,  : 
    Selenium server signals port = 4567 is already in use.

  4. Also, even when the download is successful, it ends with this error:
    
    Selenium message:no such element: Unable to locate element: {"method":"css selector","selector":"#rc-downloads-tc-modal-accept-btn"}
    (Session info: chrome=84.0.4147.89)
    For documentation on this error, please visit: https://www.seleniumhq.org/exceptions/no_such_element.html
    Build info: version: '4.0.0-alpha-2', revision: 'f148142cf8', time: '2019-07-01T21:30:10'
    System info: host: 'Jennifers-MacBook-Pro.local', ip: '2601:586:8301:6690:0:0:0:d1d%en0', os.name: 'Mac OS X', os.arch: 'x86_64', os.version: '10.15.5', java.version: '12.0.1'
    Driver info: driver.version: unknown

    Error in file_id[[i]] : subscript out of bounds

I don't know if any of the error messages that I got are unique to me. I just wanted to put it out there to make a note of it and to see if anyone else has that error. I am not sure if they are causes for concern either now or later. Downloads were successful nonetheless.

Again, thank you SO MUCH for your help with this!

fsolt commented 4 years ago

Thanks, Jennifer! Hmm.

Part of 4, where it says Selenium message, is expected (and unstoppable) chattiness from Selenium, but the subscript out of bounds error means something is going wrong. And when something goes wrong, the rest of the function (the conversion in 2, the automatic shutdown in 3) doesn't get a chance to run. Can you give me an example or two of the file IDs you're working with so I can try to reproduce the problem and figure out exactly what happened?
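To illustrate why an error partway through leaves Chrome open and the port occupied, and one common way around it, here is a minimal sketch using `on.exit()`. The function `download_with_cleanup()` and its `log` environment are hypothetical stand-ins, not ropercenter's actual code; in a real RSelenium session the cleanup would presumably close the browser and stop the server instead of flipping a flag.

```r
# Hypothetical stand-in for a download function; not ropercenter's actual code.
download_with_cleanup <- function(fail = FALSE, log = new.env()) {
  log$browser_open <- TRUE  # pretend a browser session was started here

  # Cleanup registered with on.exit() runs when the function exits,
  # whether it returns normally or stops with an error.
  # Assumed real-world equivalent: remDr$close(); rD$server$stop()
  on.exit(log$browser_open <- FALSE, add = TRUE)

  if (fail) stop("subscript out of bounds")  # simulate the mid-function error
  invisible(log)
}

log <- new.env()
try(download_with_cleanup(fail = TRUE, log = log), silent = TRUE)
print(log$browser_open)  # the cleanup ran even though the function errored
```

Without the `on.exit()` call, shutdown code placed at the end of the function body would be skipped whenever an earlier line errors, which matches the symptoms in 2 and 3.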

As for 1, well, I might just have to give up on the "Historical Archive Numbers". Actually, before I do, if you could give me a couple of examples that didn't work for you?

lin-jennifer commented 4 years ago

These are the ones I ran

# Historical Archive -- following 2 reference the same thing
roper_download("USPEW2015-GOVERNANCE")
roper_download(31096295) # Archive number for "USPEW2015-GOVERNANCE"

# "New" data Archive number
roper_download(31114069)

THANKS!

fsolt commented 4 years ago

Oh, hmm. "USPEW2015-GOVERNANCE" downloads fine for me here; I even put it in the examples!

fsolt commented 4 years ago

So, to clarify: When you ran roper_download(31114069), you encountered no problems? It gives you an .RData file and quits Chrome? It was only roper_download(31096295) that gave you those issues?

lin-jennifer commented 4 years ago

Never mind, roper_download("USPEW2015-GOVERNANCE") works -- not sure why it didn't when I first tested it -- my bad.

As for quitting Chrome, no, neither version quit Chrome; I had to do that manually both times.

roper_download(31114069) gives me an .RData, the other one (roper_download("USPEW2015-GOVERNANCE")) does not. Would it have something to do with Error in roper_download("USPEW2015-GOVERNANCE") : object 'spss' not found?

Thank you!

fsolt commented 4 years ago

Yes, that error is helpful. Okay, I just got the Error in file_id[[i]] : subscript out of bounds error here. Looks like I introduced a bug just before I wrote you. Blech. Time for a break from coding.
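For reference, here is a minimal sketch of one common way a `subscript out of bounds` error arises in R when indexing a list with `[[`. It is a generic illustration and only an assumption about the kind of bug involved; the variable `file_id` is reused from the error message, but the code is not taken from the package.

```r
# Generic illustration; not the package's actual code.
file_id <- list()  # e.g. nothing was matched on the page

# 1:length(x) counts down to 0 when x is empty, so the loop would still run
bad_idx <- 1:length(file_id)     # c(1, 0)
safe_idx <- seq_along(file_id)   # integer(0), so a loop over it never runs

# Indexing past the end of a list with [[ raises the error seen in this thread
msg <- tryCatch(file_id[[bad_idx[1]]],
                error = function(e) conditionMessage(e))
print(msg)

# Looping over seq_along() avoids touching elements that do not exist
for (i in safe_idx) {
  print(file_id[[i]])
}
```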

fsolt commented 4 years ago

Okay, I think I've got stuff fixed up now. Give it another try when you have a chance, please?

lin-jennifer commented 4 years ago

Thank you for updating this! I just tested

roper_download("USPEW2015-GOVERNANCE")
roper_download(31114069)

and they both worked great!

Thanks again!

fsolt commented 4 years ago

Thanks, Jennifer! I appreciate you helping me stamp out these last few bugs!

fsolt commented 4 years ago

Now on CRAN