ajdamico / lodown

locally download and prepare publicly-available microdata
GNU General Public License v3.0

WVS catalog #149

Closed: sebacea closed this issue 5 years ago

sebacea commented 5 years ago

I'm not able to successfully retrieve the WVS catalog.

Running

wvs_cat <-
  get_catalog( "wvs" ,
               output_dir = file.path( path.expand( "~" ) , "WVS" ) )

I got

Warning messages:
1: In readLines(dl_page) :
  incomplete final line found on 'http://www.worldvaluessurvey.org/AJDocumentationSmpl.jsp?CndWAVE=-1&SAID=-1&INID='
2: In readLines(dl_page) :
  incomplete final line found on 'http://www.worldvaluessurvey.org/AJDocumentationSmpl.jsp?CndWAVE=1'
3: In readLines(dl_page) :
  incomplete final line found on 'http://www.worldvaluessurvey.org/AJDocumentationSmpl.jsp?CndWAVE=2'
4: In readLines(dl_page) :
  incomplete final line found on 'http://www.worldvaluessurvey.org/AJDocumentationSmpl.jsp?CndWAVE=3'
5: In readLines(dl_page) :
  incomplete final line found on 'http://www.worldvaluessurvey.org/AJDocumentationSmpl.jsp?CndWAVE=4'
6: In readLines(dl_page) :
  incomplete final line found on 'http://www.worldvaluessurvey.org/AJDocumentationSmpl.jsp?CndWAVE=5'
7: In readLines(dl_page) :
  incomplete final line found on 'http://www.worldvaluessurvey.org/AJDocumentationSmpl.jsp?CndWAVE=6'

briatte commented 5 years ago

Hi @sebacea –

The warnings you are reporting should not prevent the catalog from downloading. I just ran the same get_catalog() call (the one that assigns wvs_cat) and got the following results:

  wave this_id
1   -1    3843
2   -1    3844
3   -1    8390
                                                                                                                       full_url
1              http://www.worldvaluessurvey.org/wvsdc/CO00001/F00003843_WVS_EVS_Integrated_Dictionary_Codebook_v_2014_09_22.xls
2 http://www.worldvaluessurvey.org/wvsdc/CO00001/F00003844_WVS_Values_Surveys_Integrated_Dictionary_TimeSeries_v_2014-04-25.xls
3                           http://www.worldvaluessurvey.org/wvsdc/CO00001/F00008390-WVS_Longitudinal_1981_2016_r_v20180912.zip
        output_folder
1 ~/WVS/longitudinal/
2 ~/WVS/longitudinal/
3 ~/WVS/longitudinal/

… which appear to be correct. Ignore the -1 wave number: this is how lodown marks integrated/longitudinal files.

@sebacea – perhaps you should close this issue if you did manage to download the data despite those warnings?

@ajdamico Hello again ~ I believe you can remove the warnings reported above by adding the warn = FALSE argument to the readLines() call at line 31 of wvs.R.
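
For reference, here is a minimal sketch of what warn = FALSE changes. It does not touch wvs.R or the WVS website: the temporary file below is a hypothetical stand-in for a page whose last line lacks a trailing newline, which is the condition that triggers the "incomplete final line" warning.

```r
# Hypothetical stand-in for a downloaded page: the last line has no trailing newline.
tmp <- tempfile(fileext = ".html")
writeBin(charToRaw("<html>\n<body>no trailing newline</body>"), tmp)

# Default behaviour: readLines() raises the "incomplete final line" warning.
warned <- tryCatch({ readLines(tmp); FALSE }, warning = function(w) TRUE)

# With warn = FALSE the same lines come back and no warning is raised.
quiet_lines <- readLines(tmp, warn = FALSE)
quiet_warned <- tryCatch({ readLines(tmp, warn = FALSE); FALSE }, warning = function(w) TRUE)
```

The content returned is identical in both cases; only the warning differs, which is why the warnings are harmless for building the catalog.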

sebacea commented 5 years ago

Hi @briatte, thank you for your answer. You are right regarding the use of get_catalog(). Nevertheless, the examples in http://asdfree.com/world-values-survey-wvs.html no longer work if waves are tagged -1
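
To illustrate the failure, here is a minimal sketch using a toy data frame that copies the wave and this_id columns of the three catalog rows shown above. It assumes the vignette narrows the catalog with a condition such as wave == 6; that filter is my assumption about the vignette, not a quote from it.

```r
# Toy stand-in for the catalog returned above: only longitudinal rows, all tagged -1.
wvs_cat <- data.frame(
  wave = c(-1, -1, -1),
  this_id = c(3843, 3844, 8390),
  stringsAsFactors = FALSE
)

# Assumed vignette-style filter: keep only the wave 6 files.
wave_six <- subset(wvs_cat, wave == 6)

# No row matches, so nothing is left to pass on to the download step.
nrow(wave_six)
```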

briatte commented 5 years ago

Correct: I suspect this is because this line is not sending back the table of country-specific downloads that it apparently sent back before. It looks like the page now needs a valid WVS cookie to display in full.

@ajdamico does that look like something that you would know how to fix? If not, I'll explore the problem in the next two weeks, to understand how it works (I'm afraid I know less about setting cookies than you do!).

ajdamico commented 5 years ago

hi, i'm not sure it's worth keeping up with their changes to the website..

briatte commented 5 years ago

I think I found a fix, re-using @ajdamico's wvs_appreq function in a few places, since the website seems to have changed only in its stricter requirement for a session cookie.

Downloading the full catalog (wave-specific + country-specific files + longitudinal of course) now seems to work (output copied below), so I guess the issue can be closed. Please note, though, that I have not tested the actual download function.

> get_catalog_wvs(output_dir = "wvs_test")
loading wvs catalog for wave -1

loading wvs catalog for wave 1

loading wvs catalog for wave 2

loading wvs catalog for wave 3

loading wvs catalog for wave 4

loading wvs catalog for wave 5

loading wvs catalog for wave 6

    wave this_id
1     -1    3843
2     -1    3844
3     -1    8390
7      1    8361
8      1    8349
9      1    8339
10     1    8319
11     1    8329
12     1    8362
...

full_url column:

full_url
1                http://www.worldvaluessurvey.org/wvsdc/CO00001/F00003843_WVS_EVS_Integrated_Dictionary_Codebook_v_2014_09_22.xls
2   http://www.worldvaluessurvey.org/wvsdc/CO00001/F00003844_WVS_Values_Surveys_Integrated_Dictionary_TimeSeries_v_2014-04-25.xls
3                             http://www.worldvaluessurvey.org/wvsdc/CO00001/F00008390-WVS_Longitudinal_1981_2016_r_v20180912.zip
7                               http://www.worldvaluessurvey.org/wvsdc/DC00346/F00008361-WV1_Results_Argentina_1984_v20180912.pdf
8                           http://www.worldvaluessurvey.org/wvsdc/DC00346/F00008349-WV1_Data_Argentina_1984_Excel_v20180912.xlsx
9                           http://www.worldvaluessurvey.org/wvsdc/DC00346/F00008339-WV1_Data_Argentina_1984_Excel_v20180912.xlsx
10                            http://www.worldvaluessurvey.org/wvsdc/DC00346/F00008319-WV1_Data_Argentina_1984_Spss_v20180912.zip
11                           http://www.worldvaluessurvey.org/wvsdc/DC00346/F00008329-WV1_Data_Argentina_1984_Stata_v20180912.zip
12                              http://www.worldvaluessurvey.org/wvsdc/DC00346/F00008362-WV1_Results_Australia_1981_v20180912.pdf
...

output_folder column:

             output_folder
1   wvs_test/longitudinal/
2   wvs_test/longitudinal/
3   wvs_test/longitudinal/
7         wvs_test/wave 1/
8         wvs_test/wave 1/
9         wvs_test/wave 1/
10        wvs_test/wave 1/
11        wvs_test/wave 1/
12        wvs_test/wave 1/
...

briatte commented 5 years ago

Note -- I just successfully ran the code in the "Simplified Download and Importation" section of the WVS vignette. The only change it needs is an updated filename:

F00007569-WV6_Data_United_States_2011_Spss_v20180912.rds
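
Since the version suffix in these filenames changes between releases, one option is to look the file up by pattern instead of hard-coding it. The sketch below is only an illustration: the "wave 6" folder name is inferred from the "wave 1" folder in the catalog output above, the directory lives under tempdir(), and the saved data frame is a placeholder rather than real WVS data.

```r
# Hypothetical output folder, mirroring the "wave N" layout seen in the catalog.
out_dir <- file.path(tempdir(), "wvs_demo", "wave 6")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Placeholder file standing in for the imported survey data (not real WVS content).
saveRDS(
  data.frame(x = 1),
  file.path(out_dir, "F00007569-WV6_Data_United_States_2011_Spss_v20180912.rds")
)

# Find the file by its stable parts, so a new version suffix does not break the code.
rds_path <- list.files(
  out_dir,
  pattern = "WV6_Data_United_States_.*Spss.*\\.rds$",
  full.names = TRUE
)

# Read the first match; stop early if the pattern found nothing.
stopifnot(length(rds_path) >= 1L)
wvs_df <- readRDS(rds_path[1])
```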