melff / memisc

Tools for Managing Survey Data, Creating Tables of Estimates and Data Summaries
https://melff.github.io/memisc

Multiple quotes and escape issues #14

Closed jciconsult closed 7 years ago

jciconsult commented 8 years ago

The variable descriptions in an SPSS dataset contain embedded quotes. Example: abc 'The rain doesn''t fall on Sunday '
The result is a missing vbl error message. I could not get your software to process it unless I edited the duplicated single quotes to a pattern such as _. I tried using \' instead of _, but that did not work. That is not a particular problem: I can edit the descriptions, but I can't replace the original descriptions with the edited ones. I have forgotten how to do bulk edits of descriptions. Do I have to use some kind of "for" loop? I tried descriptions(my.,ds)<-edited_descriptions_as_character_array, but this does not work. Attachment: string_replacement_problem.pdf

melff commented 8 years ago
  1. Regarding the quotations: Could you please show me the error message?
  2. For the bulk changes try something like:
ds <- within(ds,{
   foreach(v=as.symbols(vector.with.varnames),{
      description(v) <- gsub("\'","*",description(v))
   })
})

foreach is defined in memisc, in particular for the purpose of enabling bulk replacements.
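
As a rough illustration of the string replacement itself, here is a minimal base-R sketch that collapses SPSS-style doubled single quotes into one quote. The vector raw_descriptions and its contents are hypothetical stand-ins for the real variable descriptions. Only base R's gsub is used, so nothing here depends on memisc.

```r
# Hypothetical descriptions as they might appear in an SPSS syntax file,
# where a literal single quote inside a quoted string is written twice.
raw_descriptions <- c(
  abc = "The rain doesn''t fall on Sunday ",
  xyz = "No quotes here"
)

# fixed = TRUE makes gsub treat the pattern as a literal string, not a regex.
clean_descriptions <- gsub("''", "'", raw_descriptions, fixed = TRUE)

print(clean_descriptions)
```

The cleaned strings could then be written back to the data set one variable at a time, for example inside the within/foreach pattern shown above.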

jciconsult commented 8 years ago

My apologies for not getting back sooner, but I was travelling to a conference. The error message is "missing quote". I include a zip file with 3 runs and 3 variants of the variable and label files. The edited version, in which the repeated single quote is replaced with \, works nicely. Replacing the repeated single quotes with backslash-quote does not work, and the basic files from Statistics Canada do not work. I can send you a private share to a zip of the whole dataset if you wish.

Your package is most productive. I am working on a brief note on drug insurance coverage in my province. I am always working with weighted surveys. It would be nice to have a weighted code book option in memisc.

melff commented 7 years ago

Sorry for not coming back to this for so long: teaching started in September and ended only in the last few weeks. I am wondering what the original problem with the quotes in labels is, because I have not been able to figure it out from your previous messages. The basic infrastructure does not seem to have problems with quotation marks, as in:

a <- as.item(1:6)
description(a) <- "This is a test with 'quotes' (')"
labels(a) <- c("More quotation mark:' and \" "=1)
codebook(a)
====================================================================================================

   a 'This is a test with 'quotes' (')'

----------------------------------------------------------------------------------------------------

   Storage mode: integer
   Measurement: interval

                        Values and labels    N    Percent 

       1   'More quotation mark:' and " '    1   16.7 16.7
           (unlab.vld.)                      5   83.3 83.3

            Min:   1.000                                  
            Max:   6.000                                  
           Mean:   3.500                                  
       Std.Dev.:   1.708                                  
       Skewness:   0.000                                  
       Kurtosis:  -1.269