jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

[Bug]: Failed to import big data file #2837

Closed Arvydas21 closed 2 months ago

Arvydas21 commented 2 months ago

JASP Version

0.19.0

Commit ID

No response

JASP Module

Unrelated

What analysis are you seeing the problem on?

No response

What OS are you seeing the problem on?

Windows 11

Bug Description

Importing data from .sav and .csv files fails. The dataset has >1500 attributes (columns) and 444 entries. I reduced the attributes to 967, imported from a .csv file, and saved the result as Research result DB_am_1d.jasp, but the program crashes during startup and reports that the file location is bad. What is the problem that prevents me from running JASP? I can send the data file.

Expected Behaviour

Failed to import data Tyrimo rezultatu DB_am_1d.zip

Steps to Reproduce

  1. I reduced the attributes to 967
  2. ...

Log (if any)

No response

RensDofferhoff commented 2 months ago

JASP should be able to handle up to 1500 columns. I will have a look.

RensDofferhoff commented 2 months ago

Could you share the data file?

shun2wang commented 2 months ago

@RensDofferhoff the data file is in the bug report above. I think something different is going on here.

RensDofferhoff commented 2 months ago

That's the JASP file. I was hoping for the original CSV.

tomtomme commented 2 months ago

There were problems in the past with >1000 columns because of the sqlite backend.

RensDofferhoff commented 2 months ago

I had the same thought, but the COLUMN_MAX should be 2k by default. I will make a little test file to check.
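
A small test file of this kind could be generated as sketched below. This is only an illustration, not the file that was actually used: the function name `write_wide_csv`, the column names `V1 ... Vn`, and the random integer values are all assumptions. The default of 444 rows matches the entry count in the report.

```python
import csv
import random


def write_wide_csv(out, n_columns, n_rows=444, seed=0):
    """Write a CSV with n_columns integer columns and n_rows data rows.

    `out` is any text file object opened for writing (use newline="" for
    real files, as the csv module recommends).
    """
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    writer = csv.writer(out)
    # Header row: V1, V2, ..., Vn (hypothetical column names).
    writer.writerow([f"V{i}" for i in range(1, n_columns + 1)])
    for _ in range(n_rows):
        # Arbitrary small integers, similar to questionnaire-style data.
        writer.writerow([rng.randint(1, 5) for _ in range(n_columns)])
```

Writing one file per column count mentioned in this thread (for example 967, 1000, 1500, and 2000), each through `open(path, "w", newline="")`, and importing them in turn would show at which width the import starts to fail.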

RensDofferhoff commented 2 months ago

Ah, I see. The limit is currently 999 columns. We must improve this soon. But even if I recompile SQLite to handle 32k columns, the performance of loading those 999 columns is lacking. So it's not an easy fix.
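
For anyone who wants to see how a compile-time column limit shows up in practice, here is a minimal sketch that probes the SQLite library linked into Python. It says nothing about the SQLite build that ships with JASP, and the thread does not state which SQLite setting produces the 999 figure, so treat this purely as an illustration. The helper names `can_create_table` and `max_table_columns` are made up for this example.

```python
import sqlite3


def can_create_table(n_columns):
    """Return True if this SQLite build accepts a table with n_columns columns."""
    columns = ", ".join(f"c{i} INTEGER" for i in range(n_columns))
    con = sqlite3.connect(":memory:")
    try:
        con.execute(f"CREATE TABLE t ({columns})")
        return True
    except sqlite3.Error:
        # SQLite rejected the statement (for example "too many columns on t").
        return False
    finally:
        con.close()


def max_table_columns(upper=40000):
    """Binary-search the largest column count that CREATE TABLE accepts.

    `upper` must be a count that is known to fail; 40000 is above the
    largest value SQLite allows the column limit to be compiled with.
    """
    lo, hi = 1, upper  # invariant: lo columns work, hi columns fail
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if can_create_table(mid):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    print("Largest table width accepted:", max_table_columns())
```

The result depends on how the local SQLite was compiled, which is why no expected output is shown.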

shun2wang commented 2 months ago

Yes, it depends on how we store these tables. We could store them as several relational tables indexed by primary keys, rather than as a single table, to improve query performance and get around the column limit.
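
One possible reading of this suggestion is sketched below: the wide dataset is split into several narrower tables that share a `row_id` primary key, so no single table comes near the column limit. This is an assumption about what is meant, not a description of how JASP stores data; the table names `part0`, `part1`, ..., the chunk size of 500, and the helpers `store_wide` and `read_column` are all hypothetical.

```python
import sqlite3

CHUNK = 500  # hypothetical number of data columns per table


def quote(name):
    """Quote a column name as an SQLite identifier."""
    return '"' + name.replace('"', '""') + '"'


def store_wide(con, header, rows, chunk=CHUNK):
    """Store a wide dataset as several narrower tables sharing a row_id key.

    `rows` must be a list of sequences, because it is iterated once per chunk.
    """
    for part, start in enumerate(range(0, len(header), chunk)):
        names = header[start:start + chunk]
        cols = ", ".join(quote(n) for n in names)
        con.execute(
            f"CREATE TABLE part{part} (row_id INTEGER PRIMARY KEY, {cols})"
        )
        marks = ", ".join("?" for _ in range(len(names) + 1))
        con.executemany(
            f"INSERT INTO part{part} VALUES ({marks})",
            ((i, *row[start:start + chunk]) for i, row in enumerate(rows)),
        )
    con.commit()


def read_column(con, header, name, chunk=CHUNK):
    """Read one column back; only the table that holds it is touched."""
    part = header.index(name) // chunk
    cur = con.execute(f"SELECT {quote(name)} FROM part{part} ORDER BY row_id")
    return [value for (value,) in cur]
```

With this layout, reading a single variable only scans a table of at most 500 data columns, and a full row can still be reassembled by joining the parts on `row_id`.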

RensDofferhoff commented 1 month ago

A PR was merged that allows loading of 16k columns in reasonable time: https://github.com/jasp-stats/jasp-desktop/pull/5651