vnmabus / rdata

Reader of R datasets in .rda format, in Python
https://rdata.readthedocs.io
MIT License

NotImplementedError: Attributes not suported for LIST #10

Closed zoj613 closed 3 years ago

zoj613 commented 3 years ago

Hi, I stumbled across this project after seeing the differential entropy PR on SciPy, so I decided to give it a try. However, I get a strange error when I try to parse an RData file that I previously used for a research project:

----> 1 parsed = rdata.parser.parse_file("./sa_cape_weaver_PC2_int.RData")

~/.pyenv/versions/3.8.6/envs/lib/python3.8/site-packages/rdata/parser/_parser.py in parse_file(file_or_path)
    612             binary_file = buffer
    613         data = binary_file.read()
--> 614     return parse_data(data)
    615 
    616 

~/.pyenv/versions/3.8.6/envs/lib/python3.8/site-packages/rdata/parser/_parser.py in parse_data(data)
    698         return parse_data(bz2.decompress(data))
    699     elif filetype is FileTypes.gzip:
--> 700         return parse_data(gzip.decompress(data))
    701     elif filetype is FileTypes.xz:
    702         return parse_data(lzma.decompress(data))

~/.pyenv/versions/3.8.6/envs/lib/python3.8/site-packages/rdata/parser/_parser.py in parse_data(data)
    703     elif filetype in {FileTypes.rdata_binary_v2, FileTypes.rdata_binary_v3}:
    704         view = view[len(magic_dict[filetype]):]
--> 705         return parse_rdata_binary(view)
    706     else:
    707         raise NotImplementedError("Unknown file type")

~/.pyenv/versions/3.8.6/envs/lib/python3.8/site-packages/rdata/parser/_parser.py in parse_rdata_binary(data)
    719     if format_type is RdataFormats.XDR:
    720         parser = ParserXDR(data)
--> 721         return parser.parse_all()
    722     else:
    723         raise NotImplementedError("Unknown file format")

~/.pyenv/versions/3.8.6/envs/lib/python3.8/site-packages/rdata/parser/_parser.py in parse_all(self)
    285         versions = self.parse_versions()
    286         extra_info = self.parse_extra_info(versions)
--> 287         obj = self.parse_R_object()
    288 
    289         return RData(versions, extra_info, obj)

~/.pyenv/versions/3.8.6/envs/lib/python3.8/site-packages/rdata/parser/_parser.py in parse_R_object(self, reference_list)
    366 
    367             # Read CAR and CDR
--> 368             car = self.parse_R_object(reference_list)
    369             cdr = self.parse_R_object(reference_list)
    370             value = (car, cdr)

~/.pyenv/versions/3.8.6/envs/lib/python3.8/site-packages/rdata/parser/_parser.py in parse_R_object(self, reference_list)
    445 
    446             for i in range(length):
--> 447                 value[i] = self.parse_R_object(reference_list)
    448 
    449         elif info.type == RObjectType.S4:

~/.pyenv/versions/3.8.6/envs/lib/python3.8/site-packages/rdata/parser/_parser.py in parse_R_object(self, reference_list)
    360             tag = None
    361             if info.attributes:
--> 362                 raise NotImplementedError("Attributes not suported for LIST")
    363             elif info.tag:
    364                 tag = self.parse_R_object(reference_list)

NotImplementedError: Attributes not suported for LIST

Any workarounds for this?

vnmabus commented 3 years ago

It seems that I (wrongly) assumed that a list could not have attributes. It should be easy to fix: instead of raising the error, do

attributes = self.parse_R_object(reference_list)
attributes_read = True

and changing the following elif to an if.

I think that should work. If you want to submit a PR (with a test case for this), I can have a look at it tomorrow.
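To make the suggested change concrete, here is a minimal, self-contained sketch of the control flow. It is not the actual `_parser.py` code: `NodeInfo`, `parse_pairlist_node`, and the pre-decoded `stream` iterator are hypothetical stand-ins, and the read order (attributes, then tag, then CAR and CDR) is taken from the traceback and the suggestion above.

```python
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass
class NodeInfo:
    """Hypothetical stand-in for the flags decoded from an object header."""
    attributes: bool
    tag: bool


def parse_pairlist_node(info: NodeInfo, stream: Iterator[Any]) -> dict:
    """Read one pairlist node from an iterator of already-decoded objects."""
    attributes = None
    tag = None

    # Previously this branch raised NotImplementedError.
    if info.attributes:
        attributes = next(stream)

    # A plain `if` (not `elif`), so a node may carry both attributes and a tag.
    if info.tag:
        tag = next(stream)

    # Read CAR and CDR, as in the original parser.
    car = next(stream)
    cdr = next(stream)
    return {"attributes": attributes, "tag": tag, "value": (car, cdr)}


# A node that has both attributes and a tag, followed by its CAR and CDR.
node = parse_pairlist_node(
    NodeInfo(attributes=True, tag=True),
    iter([{"names": ["a"]}, "x", 1, None]),
)
```

With the original `elif`, a node that had attributes would never have its tag read, so the subsequent CAR and CDR reads would be misaligned.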

vnmabus commented 3 years ago

I have fixed the parser so that it now also reads attributes for pairlists. Note that by default the converter discards this information, as the equivalent Python object has no use for it. Please check the develop branch and tell me whether it works for your dataset.
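As a rough illustration of what "discards this information" means, the following sketch converts a parsed pairlist into a Python dict and simply ignores any attributes. The `Node` class and `pairlist_to_dict` function are hypothetical and assume every node is tagged; the real converter in rdata is more general.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    """Hypothetical parsed pairlist node (not the rdata class)."""
    tag: Optional[str]
    car: Any
    cdr: Optional["Node"]
    attributes: Optional[dict] = None


def pairlist_to_dict(node: Optional[Node]) -> dict:
    """Walk the CDR chain and map each tag to its CAR value."""
    result = {}
    while node is not None:
        # node.attributes is deliberately ignored: a dict has nowhere to keep it.
        result[node.tag] = node.car
        node = node.cdr
    return result


tail = Node(tag="b", car=2, cdr=None)
head = Node(tag="a", car=1, cdr=tail, attributes={"note": "dropped on conversion"})
converted = pairlist_to_dict(head)
```

The parsed tree still carries the attributes, so they remain available to anyone who inspects the parser output directly instead of the converted object.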

zoj613 commented 3 years ago

@vnmabus I just tried it out. It parses the file without error now. Thanks for the fix.