JanMarvin / readspss

Read the SPSS file formats
https://janmarvin.github.io/readspss/
GNU General Public License v2.0

Implement reading of files without a valid n #3

Closed JanMarvin closed 6 years ago

JanMarvin commented 6 years ago

The PSPP documentation states that n need not be valid: it can be -1 if it was not set correctly when the file was written. Such files currently cannot be read. All files at hand have a valid n, which makes me believe that this is a minor problem. Still, it requires either implementing some kind of growing vectors (converting to CharacterVector or NumericVector at the last possible moment) or calculating the number of cases as the remaining file length divided by the width of a row. The latter approach seems easier and less troublesome, though it requires every row to be of identical width.
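To illustrate the second option, here is a minimal sketch of deriving the case count from the bytes left to read. It is not the readspss implementation: the function name estimate_cases and the parameter row_width are hypothetical, and it assumes an uncompressed data section in which every row occupies exactly row_width bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of readspss): estimates the number of cases
// from the bytes left in the stream, assuming every row is exactly
// `row_width` bytes wide. Returns -1 if the remaining bytes are not a whole
// multiple of the row width, i.e. the fixed-width assumption does not hold.
std::int64_t estimate_cases(std::istream& sav, std::int64_t row_width) {
    assert(row_width > 0);

    const std::streampos databegin = sav.tellg();  // current read position
    sav.seekg(0, std::ios::end);
    const std::streampos dataend = sav.tellg();    // end of the stream
    sav.seekg(databegin);                          // restore the position

    const std::int64_t remaining =
        static_cast<std::int64_t>(dataend - databegin);
    if (remaining % row_width != 0) return -1;
    return remaining / row_width;
}
```

The -1 return is a design choice of this sketch: it lets the caller fall back to another strategy when the rows turn out not to be of identical width.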

JanMarvin commented 6 years ago

Calculating the number of cases is harder than dataend.tellg() - databegin.tellg(). The size of each row can vary depending on how many integers are stored in a double etc. Maybe varmat can be used to calculate a minimal width, but two rows can still share a single double. Most likely there is no way around either a dry run to determine the actual size of the observations, or filling vectors dynamically by pushing back values.
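The dynamic-filling alternative could look roughly like the sketch below. It is an illustration under simplifying assumptions, not the code that ended up in master: read_rows is a hypothetical name, each row is assumed to be nvars raw 8-byte doubles in host byte order, and the packing of several values into one double described above is ignored entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader (not part of readspss): each row is assumed to hold
// `nvars` raw 8-byte doubles. Columns grow with push_back, so the number of
// cases does not have to be known up front. Returns the number of complete
// rows read; a trailing partial row is discarded.
std::size_t read_rows(std::istream& sav, std::size_t nvars,
                      std::vector<std::vector<double>>& columns) {
    assert(nvars > 0);
    columns.assign(nvars, std::vector<double>());

    std::vector<double> row(nvars);
    std::size_t n = 0;

    // Test the result of the read itself instead of calling eof() first:
    // eofbit is only set once a read has actually hit the end of the stream.
    while (sav.read(reinterpret_cast<char*>(row.data()),
                    static_cast<std::streamsize>(nvars * sizeof(double)))) {
        for (std::size_t j = 0; j < nvars; ++j) columns[j].push_back(row[j]);
        ++n;
    }
    return n;
}
```

In an Rcpp setting, the finished std::vector columns could then be converted to R vectors in a single step at the end, for example with Rcpp::wrap, which matches the idea of converting at the last possible moment.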

JanMarvin commented 6 years ago

Should be fixed in current master. The implementation is not pretty, but it works, at least on files that have a sane ending. It still requires some cleanup and fixes to the sav.eof usage.

JanMarvin commented 6 years ago

Fixed in master