jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

[Feature Request]: Combine data sets #1557

Open gsociology opened 2 years ago

gsociology commented 2 years ago

Description

Combine data sets

Purpose

combine data sets

Use-case

Not sure what this means

Is your feature request related to a problem?

Not a "problem", just a request for an additional feature

Describe the solution you would like

I would like JASP to have the ability to combine data sets, two or more, by an index, e.g. one or more variables common to both data sets.

Describe alternatives that you have considered

Other software, or doing it by hand.

Additional context

No response

juliuspfadt commented 1 year ago

Hi @gsociology, thanks for the request and apologies for the late reply. @JorisGoosen is this feasible?

JorisGoosen commented 1 year ago

Use-case

Not sure what this means

It means: explain to us how this would be used by people. This helps in determining priority and such.

is this feasible

I'm not sure. What exactly is the wish here? Maybe @gsociology can give us an example or two, and then we also have some use-cases ^^

gsociology commented 1 year ago

Hi, is there a way for me to see the text of the comment that I made? At this point, I don't remember it. Thanks, Gene

JorisGoosen commented 1 year ago

Hi Gene,

You could try clicking the link "View it on github" or this: https://github.com/jasp-stats/jasp-issues/issues/1557

gsociology commented 1 year ago

Thanks. Got it. Can you combine data sets with JASP? Can you read in two data sets that have at least one variable in common, say ID, and use that to combine the data sets? Thanks Gene

JorisGoosen commented 1 year ago

Ah OK, but by "explain how people will use it" I mean: explain it in some more detail. I have no way to read your mind, I'm afraid.

Things I wonder when I read your description are:

marginalfutility commented 1 year ago

Example: a survey is done each year, and each year's survey data is released as a separate csv. Dataset 1 has columns for specific variables, with different entries in each row, for the year 2020. Dataset 2 has the same columns for the same variables (or maybe one fewer, or one additional/different), for the year 2021.

Right now I could add a column with the value 2020 in each row to the first dataset, do the equivalent with the second dataset, paste the second one beneath the first one in a csv, and then import. But this would be slow and cumbersome, especially if the columns don't match perfectly in order. The new feature would let me import each dataset individually, and then merge them in the program.
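For illustration, the row-wise case described above can be sketched in plain Python. This is only a sketch of the requested behaviour, not something JASP provides: the function name `stack_datasets`, the added `year` column, and the sample values are made up for the example. It assumes each dataset is a list of rows keyed by column name, and that no dataset already has a column called `year`.

```python
def stack_datasets(datasets):
    """Append datasets row-wise, matching columns by name.

    `datasets` maps a label (here: the survey year) to a list of row dicts.
    Hypothetical helper for illustration only.
    """
    # Collect the union of all column names, in first-seen order.
    columns = []
    for rows in datasets.values():
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)

    combined = []
    for year, rows in datasets.items():
        for row in rows:
            merged = {"year": year}  # assumed name for the added label column
            for name in columns:
                # A column missing from this dataset is filled with None.
                merged[name] = row.get(name)
            combined.append(merged)
    return combined


# Made-up sample data: the 2021 file has a different column order and no "age".
survey_2020 = [{"id": 1, "age": 34, "score": 7}, {"id": 2, "age": 51, "score": 4}]
survey_2021 = [{"score": 6, "id": 3}]

combined = stack_datasets({2020: survey_2020, 2021: survey_2021})
for row in combined:
    print(row)
```

Because columns are matched by name rather than by position, the order of columns in each file does not matter, and a column that exists in only one file ends up empty for the rows of the other.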

tomtomme commented 5 months ago

@JorisGoosen This tutorial shows how it is done in SPSS: https://ezspss.com/merging-files-in-spss/

The use cases are merging by rows OR by columns:
a) By rows: you gathered data, say in 2020, and this year gathered the same data (columns) again, but in a different file. Now you want to merge both. But maybe some column names are not exactly the same, or the columns are in a different order.
b) By columns: you gathered data from 10 persons in 2020 and now gather NEW data from the same persons again today. Now you need an ID in both files for each row, and you only want to add the new columns to a row when the ID (name of person etc.) matches.
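Case b) can be sketched the same way in plain Python. Again this is only an illustration of the requested behaviour under stated assumptions, not how JASP or SPSS implements it: the function name `merge_by_key`, the column names, and the sample values are hypothetical. It assumes the ID is unique within the second file, keeps every row of the first file, and drops rows of the second file whose ID does not occur in the first.

```python
def merge_by_key(left, right, key):
    """Add the columns of `right` to the rows of `left` where `key` matches.

    Hypothetical helper for illustration only. Rows of `left` without a
    match get None in the new columns.
    """
    # Index the second dataset by ID (assumes IDs are unique in `right`).
    right_by_key = {row[key]: row for row in right}

    # Columns that `right` contributes, in first-seen order, excluding the ID.
    new_columns = []
    for row in right:
        for name in row:
            if name != key and name not in new_columns:
                new_columns.append(name)

    merged = []
    for row in left:
        match = right_by_key.get(row[key])
        out = dict(row)
        for name in new_columns:
            # If both files share a non-ID column name, `right` wins here.
            out[name] = match.get(name) if match is not None else None
        merged.append(out)
    return merged


# Made-up sample data: "ben" appears in both files, "anna" and "cara" in one each.
wave_2020 = [{"id": "anna", "score_2020": 7}, {"id": "ben", "score_2020": 4}]
wave_today = [{"id": "ben", "score_new": 5}, {"id": "cara", "score_new": 9}]

merged = merge_by_key(wave_2020, wave_today, "id")
for row in merged:
    print(row)
```

Whether unmatched rows should be kept, dropped, or added as new rows is a design choice that a real merge feature would probably need to expose as an option.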

JorisGoosen commented 5 months ago

Alright, well it seems like this could be a useful feature but it will probably be a while before we get to it.