vnmabus / rdata

Reader of R datasets in .rda format, in Python
https://rdata.readthedocs.io
MIT License
40 stars 2 forks source link

Bug in compact sequence constructor #29

Closed trossi closed 8 months ago

trossi commented 8 months ago

Hi! Thank you for creating this great package!

I noticed that the compact sequence constructor seems to have a bug. I'm testing with R 4.2.1, but the issue is likely present in general unless the file format has changed at some version.

Here is a minimal working example writing a RDS file in R and reading it back in R or Python with rdata:

> Rscript -e "a <- -5:5; print(a); saveRDS(a, 'test.rds')"
 [1] -5 -4 -3 -2 -1  0  1  2  3  4  5
> Rscript -e "a <- readRDS('test.rds'); print(a)"
 [1] -5 -4 -3 -2 -1  0  1  2  3  4  5
> python -c "import rdata; a = rdata.conversion.convert(rdata.parser.parse_file('test.rds')); print(a)"
[-5 -4 -3 -2 -1  0  1  2  3  4  5  6  7  8  9 10]

The same issue seems to be in compact_realseq, but I couldn't find how to create such in R (expect modifying RDS file manually).

A workaround for int sequences:

import numpy as np
import rdata

def compact_intseq_constructor(state):
    info = rdata.parser.RObjectInfo(
        type=rdata.parser.RObjectType.INT,
        object=False,
        attributes=False,
        tag=False,
        gp=0,
        reference=0,
    )
    n = int(state.value[0])
    start = int(state.value[1])
    step = int(state.value[2])
    stop = start + (n - 1) * step
    value = np.array(range(start, stop + 1, step))
    return info, value

altrep_constructor_dict = {**rdata.parser.DEFAULT_ALTREP_MAP}
altrep_constructor_dict[b"compact_intseq"] = compact_intseq_constructor

data = rdata.parser.parse_file(
    'test.rds',
    altrep_constructor_dict=altrep_constructor_dict,
)

data_dict = rdata.conversion.convert(data)

print(data_dict)

I'll submit a PR with a similar fix.

vnmabus commented 8 months ago

I can reproduce. I will add the tests to ensure that it is solved.

vnmabus commented 8 months ago

Thank you for noticing and fixing the bug! I added tests to ensure that this case is covered. The fix is now on the develop branch.

trossi commented 8 months ago

Great! Thank you for the quick response!