vnmabus / rdata

Reader of R datasets in .rda format, in Python
https://rdata.readthedocs.io
MIT License

HELP ::: RNA data .rda in python #8

Closed Jorgelindo238 closed 3 years ago

Jorgelindo238 commented 3 years ago

Hi

Trying to run rdata in a Jupyter notebook, I got this error:

parsed = rdata.parser.parse_file("TCGA_eset.rda")
converted = rdata.conversion.convert(parsed)
converted

==> NotImplementedError: Type RObjectType.S4 not implemented

Any idea ? Best

vnmabus commented 3 years ago

I did not implement a conversion routine from S4 objects to Python objects, mostly because almost everyone uses S3 objects, so I did not need it.

If you want to try a PR, you would need to add a new if branch here:

https://github.com/vnmabus/rdata/blob/3154a6c865e276c1bc4d7047cc74fefc3ff7e6bf/rdata/conversion/_conversion.py#L558-L563

I do not know right now how S4 objects are internally implemented, so I am not sure if it is an easy task or a very difficult one, but if you want to try it we could discuss the problems that arise.

Jorgelindo238 commented 3 years ago

Thank you, I have a problem here now.

from .. import parser
from ..parser import RObject

I'm trying to convert S4 to S3. Yes, we could try to troubleshoot this. Thanks a lot.

vnmabus commented 3 years ago

I will need more info about the problem.

As a first step, I would try to print the obj variable for the S4 object, in order to see its internal representation.

Jorgelindo238 commented 3 years ago

Sorry, I'm new to programming. Should I print it in R or in the Jupyter notebook?

vnmabus commented 3 years ago

Ah, ok, maybe this is going to be too hard if you do not know Python yet... I will try to have a look later. I meant writing an additional elif branch:

elif obj.info.type == parser.RObjectType.S4: 
    print(obj)
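
The branch above belongs to a chain of type checks in the converter, where each R object type gets its own if/elif branch and anything unhandled raises NotImplementedError. A minimal toy sketch of that pattern follows. The names `ToyObjectType`, `ToyObject`, and `toy_convert` are hypothetical and only illustrate the idea; they are not rdata's actual classes or code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ToyObjectType(Enum):
    """Hypothetical stand-in for a parser's object-type enum (values are arbitrary)."""
    INT = 1
    STR = 2
    S4 = 3
    ENV = 4


@dataclass
class ToyObject:
    """Hypothetical parsed object: a type tag plus its raw payload."""
    type: ToyObjectType
    value: Any


def toy_convert(obj: ToyObject) -> Any:
    """Dispatch on the type tag; unknown types raise NotImplementedError."""
    if obj.type == ToyObjectType.INT:
        return [int(v) for v in obj.value]
    elif obj.type == ToyObjectType.STR:
        return [str(v) for v in obj.value]
    elif obj.type == ToyObjectType.S4:
        # Debugging branch: print the raw object to inspect its representation.
        print(obj)
        return obj.value
    else:
        raise NotImplementedError(f"Type {obj.type} not implemented")


ints = toy_convert(ToyObject(ToyObjectType.INT, ["1", "2"]))
```

Printing inside the new branch is only a first step: it shows what the parsed object contains, so that a real conversion can be written afterwards.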

Jorgelindo238 commented 3 years ago

I can manage this. So I have this back

AttributeError Traceback (most recent call last)

in
     30 conversion_function: Callable[
     31     [Union[parser.RData, parser.RObject]
---> 32 ], Any]=lambda x: x
     33 ) -> Union[Mapping[Union[str, bytes], Any], List[Any]]:
     34

AttributeError: 'ArgumentParser' object has no attribute 'RObject'

No problem to try later. Thanks
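
One plausible reading of this error is that, in the notebook session, the name `parser` was bound to an `argparse.ArgumentParser` instance rather than to the `rdata.parser` module, so the attribute lookup `parser.RObject` failed. The sketch below reproduces the same message with only the standard library. How the name came to be bound that way in the original session is an assumption.

```python
import argparse

# Assumption: somewhere earlier in the session, `parser` was bound to an
# ArgumentParser instance instead of the rdata.parser module.
parser = argparse.ArgumentParser()

try:
    parser.RObject  # attribute lookup on the wrong object
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The relative imports `from .. import parser` only work inside the package's own source files. In a notebook, the module would instead be reached through the installed package, for example as `rdata.parser` after `import rdata`.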

vnmabus commented 3 years ago

Please try installing the branch feature/s4_support and tell me if it works for you.

Jorgelindo238 commented 3 years ago

Do I have to install it in the environment of the current folder?

vnmabus commented 3 years ago

The environment is not associated with a folder. You need to uninstall the rdata package from your working environment:

pip uninstall rdata

and then install from the branch:

pip install git+https://github.com/vnmabus/rdata.git@feature/s4_support

Try it and tell me if this works for you.
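
To confirm which environment a notebook kernel is actually using, and whether rdata is installed there, a small check like the following can help. It is a sketch using only the standard library; `installed_version` is a hypothetical helper name. Note that a branch install may report the same version string as a release, so this confirms presence in the environment rather than which branch was installed.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# The interpreter path identifies the environment this session runs in.
print("Interpreter in use:", sys.executable)
print("rdata version:", installed_version("rdata"))
```

If the interpreter path differs from the environment where pip was run, the package was installed somewhere the notebook does not look.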

Jorgelindo238 commented 3 years ago

It's installed, but I get warnings:

WARNING: Missing build requirements in pyproject.toml for git+https://github.com/vnmabus/rdata.git@feature/s4_support.
WARNING: The project does not specify a build backend, and pip cannot fall back to setuptools without 'wheel'.

Let's try to open .rda data

Jorgelindo238 commented 3 years ago

No, it's not working but it's ok. I will try to export CSV from R. Thank you for the time you spent here !

Best

vnmabus commented 3 years ago

What problem did you have?

Jorgelindo238 commented 3 years ago

I tried with the previous code including :

elif obj.info.type == parser.RObjectType.S4: 
    print(obj)

Problem with

import parser
import RObject

Jorgelindo238 commented 3 years ago

Sorry, I closed the issue by mistake...

vnmabus commented 3 years ago

Ok, I added support for more R objects (environments). If the dataset that you wanted to open was https://github.com/waldronlab/curatedOvarianData/blob/master/data/TCGA_eset.rda, then I think that you can now access the info with the feature/s4_support branch.

Please, check that all the info that you wanted is converted to the appropriate Python objects before I merge this functionality.

Jorgelindo238 commented 3 years ago

Hi! I get a syntax error using the code above. Where does parser come from?

from .. import parser
from ..parser import

Jorgelindo238 commented 3 years ago

When I try without the previous code, I get this error:

NotImplementedError: Type RObjectType.ENV not implemented

vnmabus commented 3 years ago

You should reinstall the package from the branch:

pip uninstall rdata
pip install git+https://github.com/vnmabus/rdata.git@feature/s4_support

Then try:

import rdata

parsed = rdata.parser.parse_file("TCGA_eset.rda")
converted = rdata.conversion.convert(parsed)

Then, it will give you the following warnings:

/home/carlos/git/rdata/rdata/conversion/_conversion.py:632: UserWarning: Missing constructor for R class "Versions". The underlying R object is returned instead.
  stacklevel=1)
/home/carlos/git/rdata/rdata/conversion/_conversion.py:632: UserWarning: Missing constructor for R class "AnnotatedDataFrame". The underlying R object is returned instead.
  stacklevel=1)
/home/carlos/git/rdata/rdata/conversion/_conversion.py:632: UserWarning: Missing constructor for R class "MIAME". The underlying R object is returned instead.
  stacklevel=1)
/home/carlos/git/rdata/rdata/conversion/_conversion.py:632: UserWarning: Missing constructor for R class "ExpressionSet". The underlying R object is returned instead.
  stacklevel=1)

These appear because these classes do not have an associated Python object (you could program a constructor if you like, but it is not necessary: they simply will appear as primitive objects).
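
As a rough illustration of what "programming a constructor" could look like, the sketch below registers a function for the R class "ExpressionSet". It follows the custom-constructor pattern from rdata's documentation as I understand it: a function taking `(obj, attrs)`, added to a copy of `rdata.conversion.DEFAULT_CLASS_MAP` and passed to `convert`. Whether the branch hands S4 objects to the constructor as an attribute namespace is an assumption, and `expression_set_constructor` and `convert_with_constructor` are hypothetical names.

```python
from types import SimpleNamespace


def expression_set_constructor(obj, attrs):
    """Hypothetical constructor: keep only the featureData slot of the object.

    Assumption: `obj` exposes the S4 slots as attributes.
    """
    return {
        "r_class": "ExpressionSet",
        "feature_data": getattr(obj, "featureData", None),
    }


def convert_with_constructor(path):
    """Parse and convert an .rda file using the extra constructor."""
    import rdata  # imported lazily so the sketch loads without rdata installed

    class_map = {
        **rdata.conversion.DEFAULT_CLASS_MAP,
        "ExpressionSet": expression_set_constructor,
    }
    parsed = rdata.parser.parse_file(path)
    return rdata.conversion.convert(parsed, class_map)


# Stand-in object to exercise the constructor without a real dataset.
fake_eset = SimpleNamespace(featureData="feature table placeholder")
built = expression_set_constructor(fake_eset, {})
```

With such a mapping in place, the warning for that class should no longer be needed, since the converter would have a constructor to call. Calling `convert_with_constructor("TCGA_eset.rda")` would be the intended usage, under the assumptions above.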

You can then access the object as you like:

converted["TCGA_eset"].featureData.data

And the output will be:

          probeset         gene
0      220951_s_at         A1CF
1        217757_at          A2M
2        221131_at        A4GNT
3        218075_at         AAAS
4      218434_s_at         AACS
...            ...          ...
13099  216014_s_at  ZXDB///ZXDA
13100  218639_s_at         ZXDC
13101  215706_x_at          ZYX
13102    212601_at        ZZEF1
13103    212893_at         ZZZ3

[13104 rows x 2 columns]

vnmabus commented 3 years ago

Tell me about any issues that you encounter, so that I can address them before merging.

Jorgelindo238 commented 3 years ago

YESS! Thanks a lot, man! Greetings!

vnmabus commented 3 years ago

I guess that worked for you?

Jorgelindo238 commented 3 years ago

Yes it's running. Thank you.