jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

[bug]: export dataset from REDCap to JASP with labels #1581

Open iijpma opened 2 years ago

iijpma commented 2 years ago

JASP Version

0.15

Commit ID

No response

JASP Module

No response

What analysis are you seeing the problem on?

No response

What OS are you seeing the problem on?

Windows 10

Bug Description

We want to import our datasets from REDCap into JASP. REDCap can export a .csv file (raw data or labels), an .sps file, an .r file, a .sas syntax file, a .do file (Stata) and a CDISC ODM file. When importing the .csv (the only one of these files that JASP supports), we would like to have the corresponding data labels in JASP. However, this is not the case. Since we do not have other statistical programs (we only want to use JASP), we cannot use the other export options from REDCap (SPSS, SAS or Stata files). How can we export datasets from REDCap to JASP in a way that gives us all information in one file (the numbers + corresponding labels) in JASP?

Expected Behaviour

Export with all information in one file (the numbers + corresponding labels).

Steps to Reproduce

(question)

Log (if any)

No response

shun2wang commented 2 years ago

Hi @iijpma, sorry, this is probably not a JASP-related bug. As far as I know, the labeled and unlabeled data exported from REDCap differ only in how the values are stored (as numbers or as text). The .csv format is universal, and either file can be imported normally in JASP.
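Since the two .csv exports hold the same records and differ only in how values are stored, one way to get numbers and labels into a single file is to pair them up by position. The following is a minimal sketch, not a JASP or REDCap feature: it assumes both exports list the same rows and columns in the same order, and the function name `merge_raw_and_labels`, the `_label` suffix and the sample data are made up for illustration.

```python
import csv
import io


def merge_raw_and_labels(raw_file, label_file):
    """Return rows holding each raw code next to its label.

    Assumes both files are exports of the same REDCap records, so rows and
    columns line up by position. Column names come from the raw export.
    """
    raw_rows = list(csv.reader(raw_file))
    label_rows = list(csv.reader(label_file))
    if len(raw_rows) != len(label_rows):
        raise ValueError("the two exports have different numbers of rows")

    header, raw_data, label_data = raw_rows[0], raw_rows[1:], label_rows[1:]

    # A column counts as coded if any label differs from its raw value.
    coded = [
        any(r[i] != l[i] for r, l in zip(raw_data, label_data))
        for i in range(len(header))
    ]

    merged = [[]]
    for name, is_coded in zip(header, coded):
        merged[0] += [name, name + "_label"] if is_coded else [name]
    for r, l in zip(raw_data, label_data):
        row = []
        for i, is_coded in enumerate(coded):
            row += [r[i], l[i]] if is_coded else [r[i]]
        merged.append(row)
    return merged


# Tiny made-up exports standing in for the two REDCap downloads.
raw_csv = io.StringIO("record_id,sex\n1,0\n2,1\n")
label_csv = io.StringIO("Record ID,Sex\n1,Female\n2,Male\n")

merged = merge_raw_and_labels(raw_csv, label_csv)

out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(merged)
print(out.getvalue(), end="")
```

The merged table keeps the numeric code column and adds a text column beside it only where the two exports disagree, so the result is still a plain .csv that JASP can open. The labels arrive as separate text columns rather than as value labels attached to the numeric column.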

For your question, you can also export to a supported data format (.sav, .dta, ...) from REDCap. Taking SPSS data as an example, you can do as follows:

  1. Download all three files (.bat, .csv, .sps) to the same folder from the REDCap SPSS file export column
  2. Double-click to run the .bat file
  3. Double-click to open the .sps syntax file (if you have SPSS installed) and select Run All
  4. You should now see a new .sav dataset generated.

If you can't use commercial software, the R syntax provided by REDCap may still be worth trying. If the above solution does not work, you may need to manually edit the variable names and value labels after importing the .csv file. Please close this issue if the above solves your problem.

Cheers

iijpma commented 2 years ago

Thank you for the response. We do not have access to other statistical programs (SPSS, SAS or Stata), so we can only use JASP. Therefore, we can only use .csv, and not the other supported data formats, to export the datasets to JASP. What do you mean by 'perhaps the R syntax provided by REDcap can still be tried'? How does that work in JASP? Indeed, manual editing is an option, but we have very large datasets with many (>100) variables, so it is not really feasible and is prone to errors.

shun2wang commented 2 years ago

> so we can only use JASP

Sorry, I may have overlooked this. It still may not be a JASP bug, though; it may be a feature request.

For the R syntax I'm talking about, you can also export 2 files (.r & .csv) from REDCap. Then:

Note that you may need some basic knowledge of R to use the above functions, and you should read the help documentation for these packages. This is not a JASP-recommended practice, but it may solve your problem temporarily.

boutinb commented 2 years ago

Hi @iijpma, JASP can load SPSS files in .sav and .por formats, Stata files in .dta format, and SAS files in .sas7bdat and .sas7bcat formats. All these formats contain the labels. It would be quite bad luck if REDCap cannot convert to at least one of these formats... As @icekylin said, you can still use R with the haven library, or even Python with the pandas library, to convert an R object into a .sav file.
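A rough sketch of what the Python route could look like, under stated assumptions: pandas reads the raw .csv, and the separate third-party package pyreadstat (a wrapper around ReadStat) writes the .sav with value labels attached. Both packages must be installed. The helper names, file names and the label mapping below are placeholders, not anything REDCap or JASP provides.

```python
def int_keyed(value_labels):
    """Turn {"sex": {"0": "Female"}} into {"sex": {0: "Female"}}.

    Labels collected from text files have string codes, while numeric
    columns read from a .csv hold numbers, so the keys are converted to
    match the data they describe.
    """
    return {
        column: {int(code): label for code, label in mapping.items()}
        for column, mapping in value_labels.items()
    }


def csv_to_sav(csv_path, sav_path, value_labels):
    """Write the raw .csv as a .sav file carrying the given value labels."""
    # Third-party packages, imported here so int_keyed() works without them.
    import pandas as pd
    import pyreadstat

    df = pd.read_csv(csv_path)
    pyreadstat.write_sav(
        df, sav_path, variable_value_labels=int_keyed(value_labels)
    )


# Hypothetical usage; file names and labels are placeholders:
# csv_to_sav("redcap_raw.csv", "redcap_labelled.sav",
#            {"sex": {"0": "Female", "1": "Male"}})
```

The resulting .sav file is in one of the formats listed above, so the labels should travel with the data when it is opened in JASP.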

JorisGoosen commented 2 years ago

As an aside, and to go more towards the feature request @icekylin mentioned:

The next version of JASP (probably 2 months out) will have much better data-editing support. One of the features I was planning to implement is making sure that nominal-text columns (which I assume is how your REDCap label .csv data ends up) are actually convertible to ordinal/nominal and perhaps even scale (losing information there). See https://github.com/jasp-stats/jasp-issues/issues/1633

Would that actually solve your problem @iijpma ?

shun2wang commented 2 years ago

@JorisGoosen I think this feature request is perhaps more about generating a data file from related files (values and labels), but I agree with your feature request about variable type conversion. Here is something about value and label records in .sav (which I think you probably already know, since ReadStat is mostly built on these standards).

I will give you two files exported from REDCap (the R object file and the .csv data). They can generate a dataset with labels if you follow the steps I described. In fact, in SPSS and Stata, data can be stored separately from labels. For example, the .sps I provided only stores labels and the commands to generate datasets. dataset.zip

tomtomme commented 9 months ago

@boutinb To this day there is still no direct export to SAV (or the other file formats) within REDCap. I looked it up in their documentation. REDCap only ever exports the .csv, the syntax file and a .bat.

@shun2wang & @JorisGoosen The solution you propose - could that kind of SAV-generation be automated within JASP? Like: "import from R- and CSV-file to get labels and other metadata", where you point JASP to those two files and then it does the magic for you?

related: https://github.com/jasp-stats/jasp-issues/issues/611

JorisGoosen commented 8 months ago

> The solution you propose - could that kind of SAV-generation be automated within JASP? Like: "import from R- and CSV-file to get labels and other metadata", where you point JASP to those two files and then it does the magic for you?

We could add a "labels" import to the CSV import, where the labels would be in JSON format: { "column0": { "0": "label 0", "1": "label 1"} }

This could be a nice feature if combined with, for instance, some R code of ours that people could run to generate the JSON. But it will be a while before we get to that. It makes more sense to focus on importing all data from ReadStat correctly first. And if we are adding a new importer, then it should probably be for Excel.
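To make the proposed label file concrete, here is a small sketch of how such JSON could be generated from the two REDCap .csv exports. It is written in Python rather than R and is purely illustrative: JASP has no such import today, the function name `collect_value_labels` and the sample data are invented, and it assumes the raw and label exports list the same rows and columns in the same order.

```python
import csv
import io
import json


def collect_value_labels(raw_file, label_file):
    """Map each coded column to its {code: label} pairs.

    Column names are taken from the raw export; values are matched to
    labels by position in the two files.
    """
    raw_rows = list(csv.reader(raw_file))
    label_rows = list(csv.reader(label_file))
    header = raw_rows[0]

    labels = {name: {} for name in header}
    for raw, lab in zip(raw_rows[1:], label_rows[1:]):
        for name, code, label in zip(header, raw, lab):
            if code != "":  # skip empty cells (missing values)
                labels[name].setdefault(code, label)

    # Keep only columns where at least one label differs from its code.
    return {
        name: mapping
        for name, mapping in labels.items()
        if any(code != label for code, label in mapping.items())
    }


# Tiny made-up exports standing in for the two REDCap downloads.
raw_csv = io.StringIO("record_id,sex\n1,0\n2,1\n3,0\n")
label_csv = io.StringIO("Record ID,Sex\n1,Female\n2,Male\n3,Female\n")

value_labels = collect_value_labels(raw_csv, label_csv)
print(json.dumps(value_labels, indent=2))
```

Only values that actually occur in the data end up in the mapping, so labels for unused answer options would have to come from another source such as the exported syntax file.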

I think running R code during data import from random sources would be a pretty terrible idea btw.