jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

[Feature Request]: Allow data to be read as panel data #1876

Open Apra119 opened 1 year ago

Apra119 commented 1 year ago

Description

Panel Data

Purpose

To allow students and researchers in economics, finance, and accounting, as well as others, to carry out panel data research in JASP.

Use-case

To analyse data covering multiple firms over multiple years, or multiple countries over multiple years.

Is your feature request related to a problem?

Panel data needs to be declared as such before the analysis, which gives better estimation results. In econometrics, datasets are classified as time series, cross-sectional, or panel data, so a way to classify the structure of the dataset is required.

Describe the solution you would like

Allow the user to classify the structure of the dataset before the analysis. This could happen when a new dataset is opened.

Describe alternatives that you have considered

Setting the structure at the moment a new dataset is opened.

Additional context

Panel data also known as long format dataset.

In R, a dataset can be declared as panel data using the "pdata.frame" function from the plm package (https://search.r-project.org/CRAN/refmans/plm/html/pdata.frame.html).

It would be helpful if this feature request were granted; it would be very useful for students and researchers who work with panel data.

juliuspfadt commented 1 year ago

@Apra119, thanks for the request. Could you have a look at how similar your request is to this one: https://github.com/jasp-stats/jasp-issues/issues/1382?

Apra119 commented 1 year ago

@juliuspf, thank you for the reply. Yes, I have had a look at #1382, and it is similar. However, if possible, I would like to request that the structure of the data (time series, cross-sectional, or panel) can be set when we open a new dataset. I apologize for any inconvenience caused by the request.

juliuspfadt commented 1 year ago

No need to apologize. I will adjust the title and contemplate whose expertise this falls under

Apra119 commented 1 year ago

Thank you for your kind attention

anschmieg commented 11 months ago

Hello @juliuspfadt and @JorisGoosen,

please excuse the annoyance, but I would like to inquire about the progress on this. As a fellow economist, I find the issue highly relevant, and as of now I would say it is one of the main reasons I have to stick with gretl over JASP.

I would appreciate an update on whether and when to expect such a feature.

Thanks and kind regards

EJWagenmakers commented 11 months ago

We have three new part-time programmers on board (courtesy of Utrecht University) and, if I am not mistaken, this was one of the issues that they were going to fix together with @JorisGoosen, right?

JorisGoosen commented 11 months ago

Is this related to the support for long data format? https://github.com/jasp-stats/jasp-issues/issues/33

Because that would indeed be upcoming in the short term.

Apra119 commented 11 months ago

No, panel data is not the same as long data format. Panel data, also known as longitudinal data or cross-sectional time series data, is a type of dataset that includes observations on multiple entities (such as individuals, firms, or countries) over multiple time periods. Each entity is observed repeatedly over time, resulting in a structured dataset with both cross-sectional and time-series dimensions. Panel data allows for the analysis of both within-entity (time-series) and across-entity (cross-sectional) variation, making it useful for studying individual and aggregate behavior, as well as examining the effects of policies or interventions over time.

On the other hand, long data format refers to a specific way of organizing data within a panel data structure. In the long data format, each observation is represented as a separate row, with variables indicating the entity, time period, and corresponding values. This format is often used when transforming panel data into a format suitable for certain statistical analyses or modeling techniques.
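To make the distinction concrete, here is a minimal Python sketch (standard library only) of the same small panel stored first in wide and then in long format. The firm names, years, and revenue values are made up for illustration, and `wide_to_long` is a hypothetical helper, not part of JASP, gretl, or any library.

```python
# Hypothetical panel: revenue for two firms over three years (made-up numbers).
# Wide format: one row per entity, one column per time period.
wide = [
    {"firm": "A", "2019": 10.0, "2020": 12.0, "2021": 15.0},
    {"firm": "B", "2019": 7.0, "2020": 6.5, "2021": 8.0},
]


def wide_to_long(rows, id_col, value_name):
    """Turn one-row-per-entity data into one-row-per-(entity, period) data."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col:
                continue  # the id is repeated on every long row instead
            long_rows.append(
                {id_col: row[id_col], "year": int(col), value_name: value}
            )
    return long_rows


# Long format: one row per (firm, year) observation.
long_data = wide_to_long(wide, id_col="firm", value_name="revenue")
for r in long_data:
    print(r)
```

Both layouts hold the same panel: the wide table has 2 rows and the long table has 6, one per firm-year. The layout is only a storage choice, while "panel" describes what the observations are.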

In summary, panel data refers to the type of dataset that includes observations on multiple entities over time (a type of data that includes both cross-sectional and time-series dimensions), while long data format is a specific way of organizing data within a panel data structure.

Note: the above answer is based on AI and the web.

Time series, cross-sectional, and panel data analyses are widely used in the social sciences, specifically in economics, business, finance, and accounting.

Yes, similar to @schmiega, we use Gretl for these analyses. It has an initial menu where the user classifies the dataset as time series, cross-sectional, or panel data.

JorisGoosen commented 11 months ago

All right, well, I'm not sure if this is the "AI" or @Apra119, but in any case, the issue specifically says:

Additional context Panel data also known as long format dataset.

???

EJWagenmakers commented 11 months ago

Background: https://www.aptech.com/blog/introduction-to-the-fundamentals-of-panel-data/#:~:text=Panel%20data%20is%20a%20collection,people%2C%20countries%2C%20and%20companies.

Apra119 commented 11 months ago

Thank you. I am sorry, I do not know much about the meaning of long data format, and I might be wrong. However, according to the link provided by @EJWagenmakers, panel data can be stored in either long or wide format (see the section "Wide and Long Panel Datasets").

Most importantly, a panel dataset can be balanced or unbalanced. In economics, banking, finance, and accounting research, it is quite common to work with unbalanced panel datasets, meaning that not every entity is observed in every time period, so the number of observations differs across entities.

Typically, when working with panel datasets, it is necessary to designate or generate an "ID" or "group" column and a "time" or "period" column. This allows statistical software such as Stata and R to determine whether the data is balanced (every entity is observed in every time period) or unbalanced (some entities are missing in some periods). In panel data analysis, the time dimension can be recorded at various frequencies, including monthly, quarterly, and yearly intervals.
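As a rough illustration of what such a declaration makes possible, the Python sketch below (standard library only) takes long-format rows plus the names of the ID and time columns and reports whether the panel is balanced. The function name `panel_summary`, the column names, and the data are assumptions made up for this example; it does not reproduce how Stata, R, or gretl implement the check.

```python
from collections import defaultdict


def panel_summary(rows, id_col, time_col):
    """Count entities and periods, and check whether the panel is balanced."""
    periods_by_entity = defaultdict(set)
    for row in rows:
        entity, period = row[id_col], row[time_col]
        if period in periods_by_entity[entity]:
            # An (entity, period) pair must identify exactly one observation.
            raise ValueError(f"duplicate observation for {entity!r} in {period!r}")
        periods_by_entity[entity].add(period)

    # Every period that occurs anywhere in the dataset.
    all_periods = set().union(*periods_by_entity.values())
    # Balanced here means: each entity is observed in every one of those periods.
    balanced = all(p == all_periods for p in periods_by_entity.values())
    return {
        "entities": len(periods_by_entity),
        "periods": len(all_periods),
        "balanced": balanced,
    }


# Made-up example: firm B has no observation for 2021.
rows = [
    {"firm": "A", "year": 2019, "revenue": 10.0},
    {"firm": "A", "year": 2020, "revenue": 12.0},
    {"firm": "A", "year": 2021, "revenue": 15.0},
    {"firm": "B", "year": 2019, "revenue": 7.0},
    {"firm": "B", "year": 2020, "revenue": 6.5},
]
summary = panel_summary(rows, id_col="firm", time_col="year")
print(summary)
```

With firm B missing in 2021, the summary reports 2 entities, 3 periods, and `balanced` as `False`. Adding a 2021 row for firm B would make the panel balanced.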