cmap / cmapPy

Assorted tools for interacting with .gct, .gctx files and other Connectivity Map (Broad Institute) data/tools
https://clue.io/cmapPy/index.html
BSD 3-Clause "New" or "Revised" License

Convert .gctx file to .gct #47

Closed FarshidShekari closed 6 years ago

FarshidShekari commented 6 years ago

This is my code for that purpose:

import sys
from cmapPy.pandasGEXpress import gctx2gct

def main():
    gctx2gct.gctx2gct_main(sys.argv)

if __name__ == '__main__':
    main()

When I run it from the console, I get the error below:

Traceback (most recent call last):
  File "gct2npy.py", line 25, in <module>
    main()
  File "gct2npy.py", line 8, in main
    gctx2gct.gctx2gct_main(sys.argv)
  File "C:\Users\Farshid\AppData\Local\Programs\Python\Python35\lib\site-packages\cmapPy\pandasGEXpress\gctx2gct.py", line 51, in gctx2gct_main
    in_gctoo = parse_gctx.parse(args.filename, convert_neg_666=False)
AttributeError: 'list' object has no attribute 'filename'

In fact, I want to write the cmapPy equivalent of the code below (which uses the deprecated version of cmap):

import sys
import numpy as np
import cmap.io.gct as gct

def main():
    infile = sys.argv[1]
    outfile = sys.argv[2]

    gctobj = gct.GCT(infile)
    gctobj.read()

    data = gctobj.matrix[:, :].astype('float64')

    np.save(outfile, data)

if __name__ == '__main__':
    main()

oena commented 6 years ago

Hi @FarshidShekari, I'm not sure I totally follow your logic. gctx2gct is already included as a command-line method, so I'm not sure why you're writing your own. If you have cmapPy properly set up in your conda environment or installed, you can already run it as a command line script.

That being said, if you'd like to convert a file in a Python session, it's a two-liner: you can just parse the file in and then write it to whichever (gct, gctx) format would be most convenient.

FarshidShekari commented 6 years ago

I don't use Anaconda. I checked for gctx2gct on the command line, but there is no such command. When I try to parse the gctx file, I get a data frame error.

oena commented 6 years ago

Ah, OK. If you're not using the conda environment, you need to run it as a Python script, i.e. python gctx2gct.py -filename my/file/path
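
For context on the AttributeError in the traceback above: gctx2gct_main reads args.filename, so it expects an object of parsed arguments rather than the raw sys.argv list. The sketch below reproduces the difference using only the standard-library argparse module. build_demo_parser is a hypothetical stand-in that defines just the -filename flag mentioned in this thread; it is not cmapPy's actual parser.

```python
import argparse


def build_demo_parser():
    # Hypothetical stand-in parser; only the -filename flag comes from this thread.
    parser = argparse.ArgumentParser(description="demo of parsed vs. raw arguments")
    parser.add_argument("-filename", required=True, help="path to the input file")
    return parser


# What sys.argv would hold for: python gctx2gct.py -filename zzzz.gctx
raw_argv = ["gctx2gct.py", "-filename", "zzzz.gctx"]

# A plain list has no .filename attribute, which is what the traceback reports.
print(hasattr(raw_argv, "filename"))

# Parsing the list (minus the script name) yields a namespace that does.
args = build_demo_parser().parse_args(raw_argv[1:])
print(args.filename)
```

This is why calling the script with -filename works while handing sys.argv straight to gctx2gct_main does not.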

FarshidShekari commented 6 years ago

As I mentioned above, I wrote my code in a file like this:

import sys
from cmapPy.pandasGEXpress import gctx2gct

def main():
    gctx2gct.gctx2gct_main(sys.argv)

if __name__ == '__main__':
    main()

and saved it in a file named ab.py. When I run it in cmd with the command python ab.py zzzz.gctx jjjj.gct, I get the error mentioned above. How can I handle it?

oena commented 6 years ago

You can just import cmapPy.pandasGEXpress.parse and cmapPy.pandasGEXpress.write_gct instead, and replace the single line in your main method with two lines that (1) parse in the gctx and then (2) write it out to gct.
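
A minimal sketch of those two steps, assuming parse.parse(path) returns a GCToo object and write_gct.write(gctoo, out_path) writes it as .gct, as described in the cmapPy documentation. The names convert_gctx_to_gct and default_gct_path are made up for this example, and the cmapPy import sits inside the function so the file can be imported without cmapPy installed.

```python
import os


def default_gct_path(gctx_path):
    # Hypothetical helper: reuse the input name with a .gct extension.
    root, _ext = os.path.splitext(gctx_path)
    return root + ".gct"


def convert_gctx_to_gct(gctx_path, gct_path=None):
    # Imported here so that only this function needs cmapPy.
    from cmapPy.pandasGEXpress import parse, write_gct

    if gct_path is None:
        gct_path = default_gct_path(gctx_path)

    gctoo = parse.parse(gctx_path)      # (1) parse in the gctx
    write_gct.write(gctoo, gct_path)    # (2) write it out to gct
    return gct_path
```

From a script like ab.py, this could be called as convert_gctx_to_gct(sys.argv[1], sys.argv[2]). Note that it still loads the whole matrix into memory.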

FarshidShekari commented 6 years ago

I changed the code in my main method to gctx = parse.parse("bgedv2_QNORM.gctx"), but when I run it I get a memory error. How can I handle it? (My file is about 11 GB.)

oena commented 6 years ago

If you're running out of memory, that's a hardware issue, not an issue with the cmapPy package. You can either read only a subset of the file into memory using hyperslab selection (the documentation and tutorials both provide details on how to do this), or you can try running your script on a different system.
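
As a rough sketch of the subset approach, the code below reads one block of columns at a time and copies each block into a .npy file on disk, which matches the original goal of saving the matrix with NumPy. It assumes that parse.parse accepts cidx (a list of integer column indexes), row_meta_only and col_meta_only, and that the returned GCToo exposes the matrix as data_df; please check these against the cmapPy documentation for your version. index_chunks, gctx_to_npy_in_chunks and the chunk size are illustrative choices.

```python
def index_chunks(n_items, chunk_size):
    # Yield consecutive lists of integer indexes covering range(n_items).
    for start in range(0, n_items, chunk_size):
        yield list(range(start, min(start + chunk_size, n_items)))


def gctx_to_npy_in_chunks(gctx_path, npy_path, chunk_size=1000):
    # Imported here so that only this function needs numpy and cmapPy.
    import numpy as np
    from cmapPy.pandasGEXpress import parse

    # Read only the metadata to learn the matrix shape (assumed options).
    n_rows = len(parse.parse(gctx_path, row_meta_only=True))
    n_cols = len(parse.parse(gctx_path, col_meta_only=True))

    # Disk-backed .npy array, so the full matrix never has to fit in RAM.
    out = np.lib.format.open_memmap(
        npy_path, mode="w+", dtype="float64", shape=(n_rows, n_cols)
    )
    for cols in index_chunks(n_cols, chunk_size):
        block = parse.parse(gctx_path, cidx=cols)  # only these columns
        out[:, cols[0]:cols[-1] + 1] = block.data_df.values
    out.flush()
```

Peak memory then depends on chunk_size rather than on the file size. The float64 output can take more disk space than the source file if the source stores a narrower type.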
