jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

[Feature Request]: JASP file without data (only analysis) #2743

Open patc3 opened 6 months ago

patc3 commented 6 months ago

Description

Possibility to export or otherwise share the analyses (with or without output) without the data

Purpose

Share analysis without data

Use-case

Collaborate on a JASP file (e.g. using git with a cloud remote) without sharing confidential data. Another use case: prepare an analysis template (e.g. for colleagues, students, etc.) so that they can pick the dataset to which the analyses are applied (this is currently possible, but the file necessarily contains data from the start).

Is your feature request related to a problem?

I want to track changes using git and a cloud remote (to collaborate), but that would entail uploading the data as well.

Is your feature request related to a JASP module?

Unrelated

Describe the solution you would like

JASP file format that doesn't contain the data

Describe alternatives that you have considered

R script

Additional context

No response

tomtomme commented 6 months ago

Would it not be possible to do this with some fake data? Plenty of such data sets are available in JASP's Data Library.

boutinb commented 6 months ago

Once we have R syntax complete (I hope this year), this would be possible.

patc3 commented 6 months ago

Thanks @boutinb

@tomtomme conceptually it's feasible, but practically it's not that simple. I tried yesterday with two real datasets and couldn't make it work.

First I tried removing all rows, but after saving, syncing the real data later didn't work: several analyses broke (mediation and descriptives broke, while SEM and filters worked, for example).

So instead I tried to make a mock dataset and sync the real one later: I exported the datasets from JASP, opened them in R, randomized each column (so as to keep their types), and synced them. When I used only a subset of the rows (to minimize the amount of data exposed), several things broke (e.g. sometimes variables would change type because not all values were observed, which made them ordinal). When I used the entire dataset, it almost worked, but variables that had values and labels were exported with values only. In this real project I had a filter that used labels, which were lost when I exported the dataset to randomize it in R, so in this specific case the filter broke.

So conceptually yes, it's "feasible", but in practice no, I don't think it is. Mind you, I needed R knowledge to randomize the data; but this is a moot point, as simply randomizing the entire dataset doesn't make sharing the data acceptable (at least not in Canada or the US, the two countries I'm familiar with). So the reality is I couldn't make it work, and I did give it a good try.
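
For readers who want to see what the per-column randomization described above looks like, here is a minimal sketch. It is in Python rather than R, it is not the script used in this thread, and the function name `shuffle_columns` and the sample rows are made up for illustration. It permutes each column independently, so every column keeps its original set of values (and therefore its type) while the links between columns within a row are broken.

```python
import random


def shuffle_columns(rows, seed=None):
    """Return a copy of `rows` with each column permuted independently.

    `rows` is a list of dicts that all share the same keys
    (hypothetical input shape, chosen for this sketch).
    """
    if not rows:
        return []
    rng = random.Random(seed)
    names = list(rows[0].keys())
    shuffled = {}
    for name in names:
        values = [row[name] for row in rows]
        rng.shuffle(values)  # in-place permutation of this column only
        shuffled[name] = values
    # Reassemble rows from the independently shuffled columns.
    return [{name: shuffled[name][i] for name in names}
            for i in range(len(rows))]


# Made-up example data, not taken from any real project.
real = [
    {"id": 1, "group": "control", "score": 3.2},
    {"id": 2, "group": "treatment", "score": 4.8},
    {"id": 3, "group": "control", "score": 2.9},
    {"id": 4, "group": "treatment", "score": 5.1},
]
mock = shuffle_columns(real, seed=1)
```

Note that every original value is still present in the output, only in a different row. That is consistent with the point made above: shuffling alone does not make confidential data shareable, and it does nothing about value labels lost on export.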