maximilianh / cellBrowser

main repo: https://github.com/ucscGenomeBrowser/cellBrowser/ - Python pipeline and Javascript scatter plot library for single-cell datasets, http://cellbrowser.rtfd.org
https://github.com/ucscGenomeBrowser/cellBrowser/
GNU General Public License v3.0
104 stars 41 forks source link

python >= 3.9 compatibility (tostring -> tobytes) #239

Closed kriemo closed 2 years ago

kriemo commented 2 years ago

The following error occurs when calling cbBuild with python v.3.10:

INFO:root:Auto-detect: Numbers in matrix are of type 'float' ERROR:root:Unexpected error: (<class 'AttributeError'>, AttributeError("'array.array' object has no attribute 'tostring'"), <traceback object at 0x103f1ffc0>) Traceback (most recent call last): File "/Users/kriemo/.local/pipx/venvs/cellbrowser/lib/python3.10/site-packages/cellbrowser/cellbrowser.py", line 4783, in cbBuildCli build(confFnames, outDir, port, redo=options.redo) File "/Users/kriemo/.local/pipx/venvs/cellbrowser/lib/python3.10/site-packages/cellbrowser/cellbrowser.py", line 4598, in build convertDataset(inDir, inConf, outConf, datasetDir, redo) File "/Users/kriemo/.local/pipx/venvs/cellbrowser/lib/python3.10/site-packages/cellbrowser/cellbrowser.py", line 3955, in convertDataset convertExprMatrix(inConf, outMatrixFname, outConf, sampleNames, geneToSym, datasetDir, needFilterMatrix) File "/Users/kriemo/.local/pipx/venvs/cellbrowser/lib/python3.10/site-packages/cellbrowser/cellbrowser.py", line 3297, in convertExprMatrix matType = matrixToBin(outMatrixFname, geneToSym, binMat, binMatIndex, discretBinMat, discretMatrixIndex, metaSampleNames, matType=matType, genesAreRanges=genesAreRanges) File "/Users/kriemo/.local/pipx/venvs/cellbrowser/lib/python3.10/site-packages/cellbrowser/cellbrowser.py", line 2052, in matrixToBin exprStr, minVal = exprEncode(geneId, exprArr, matType) File "/Users/kriemo/.local/pipx/venvs/cellbrowser/lib/python3.10/site-packages/cellbrowser/cellbrowser.py", line 1934, in exprEncode exprStr = array.array(arrType, exprArr).tostring() AttributeError: 'array.array' object has no attribute 'tostring'

The tostring() method for array.array objects has been removed in python >= v3.9. This PR substitutes in the alias method tobytes(), which fixes the error, and is backwards compatible with python >= v3.2.

https://docs.python.org/3.9/whatsnew/3.9.html#removed

maximilianh commented 2 years ago

Oh awesome, thanks!

also, I'm curious where you're using the cell browser and why. Is your website publicly accessible?

kriemo commented 2 years ago

Thanks for providing and maintaining such a useful tool. I work with a team that often analyzes single cell datasets (https://github.com/rnabioco). We've used the cellbrowser to host private browsers for collaborators (via aws s3) or public browsers for published datasets. The cellbrowser has been extremely helpful for our collaborators without analysis/coding experience who may not be able to set up the necessary software to run a command line or R based visualization tool.

Here are two public sites: https://www.pneuroonccellatlas.org/ https://github.com/rnabioco/lung-scrna

maximilianh commented 2 years ago

Thanks for the links! I added them to our README.

If you want us to host some of your datasets that could be of broad interest, let us know.

On Thu, Mar 31, 2022 at 7:03 PM Kent Riemondy @.***> wrote:

Thanks for providing and maintaining such a useful tool. I work with a team that often analyzes single cell datasets (https://github.com/rnabioco). We've used the cellbrowser to host private browsers for collaborators (via aws s3) or public browsers for published datasets. The cellbrowser has been extremely helpful for our collaborators without analysis/coding experience who may not be able to set up the necessary software to run a command line or R based visualization tool.

Here are two public sites: https://www.pneuroonccellatlas.org/ https://github.com/rnabioco/lung-scrna

— Reply to this email directly, view it on GitHub https://github.com/maximilianh/cellBrowser/pull/239#issuecomment-1084858083, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACL4TOTH6BORJJMRTWSFXDVCXLF3ANCNFSM5SDJJ3UQ . You are receiving this because you modified the open/close state.Message ID: @.***>

tkakar commented 2 years ago

@maximilianh Sorry for commenting on a closed issue, but I got this error today on line 1958. I commented line 1958 and the error is resolved. My python version is Python 3.9.12.

    `line 1958 exprStr = array.array(arrType, exprArr).tostring()
    # Python 3.9 removed tostring()
    if sys.version_info >= (3, 2):
        exprStr = array.array(arrType, exprArr).tobytes()
    else:
        exprStr = array.array(arrType, exprArr).tostring()`
maximilianh commented 2 years ago

Hi @tkakar, thanks for commenting on a closed issue. :-)

Do you mean that you hadn't updated the cell browser yet to the new version? Or do you mean that you had updated it and there is problem in our new code that tries to avoid this issue and you're on Python 3.9.12 and somehow the line with tostring() is run?

tkakar commented 2 years ago

@maximilianh Thanks for the prompt response. I mean I installed the cellbrowser today (I assume it's updated?), and got this error. If the installation does not provide updated version, how can I update it. Many thanks.

maximilianh commented 2 years ago

Hm, which version did you install? Can you tell us what "pip show cellbrowser" says?

On Tue, May 31, 2022 at 5:22 PM Tabassum Kakar @.***> wrote:

@maximilianh https://github.com/maximilianh Thanks for the prompt response. I mean I installed the cellbrowser today (I assume it's updated?), and got this error. If the installation does not provide updated version, how can I update it. Many thanks.

— Reply to this email directly, view it on GitHub https://github.com/maximilianh/cellBrowser/pull/239#issuecomment-1142279876, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACL4TLYPZ5TRYFUU6M5FBLVMYVENANCNFSM5SDJJ3UQ . You are receiving this because you were mentioned.Message ID: @.***>

tkakar commented 2 years ago

Here is the output Name: cellbrowser Version: 1.2.0 Summary: UCSC Cellbrowser, an interactive browser for single cell data. Includes converters and basic pipelines for text files, Seurat, Scanpy and Cellranger. Home-page: https://github.com/maximilianh/cellBrowser Author: Maximilian Haeussler Author-email: max@soe.ucsc.edu License: GPL 3 Location: /opt/homebrew/lib/python3.9/site-packages

matthewspeir commented 2 years ago

Do you also have numpy installed? I seem to remember getting that error with the cell browser in an environment where I didn't have numpy installed.

tkakar commented 2 years ago

@matthewspeir thank you. I did not have numpy, but installing numpy and reinstalling the cellbrowser solved the issue. Many thanks. Might be helpful to add it to the readme for someone else having similar issue.

maximilianh commented 2 years ago

This is a bug and a very clear one. Sorry to both of you.

When I added the code to fix the .tostring() issue, I got a merge conflict and didn't remove the old line. So now the code is there twice. I think this breaks the script for all users where numpy is not installed.

We may need to merge the bugfix commit 830f74a and release again.

Also, the documentation should strongly encourage installing numpy. I am working around the absence of numpy with strange hacks and that makes the code more and more complicated. It's working without numpy but the data conversion is a lot slower.

On Tue, May 31, 2022 at 5:33 PM Tabassum Kakar @.***> wrote:

@matthewspeir https://github.com/matthewspeir thank you. I did not have numpy, but installing numpy and reinstalling the cellbrowser solved the issue. Many thanks. Might be helpful to add it to the readme for someone else having similar issue.

— Reply to this email directly, view it on GitHub https://github.com/maximilianh/cellBrowser/pull/239#issuecomment-1142292108, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACL4TMEGEKCIYQIZSFZE2DVMYWKZANCNFSM5SDJJ3UQ . You are receiving this because you were mentioned.Message ID: @.***>

matthewspeir commented 2 years ago

I think numpy should only be necessary if you're working with mtx files (which I think we mention in the installation instructions on readthedocs), but maybe that's changed and it's necessary elsewhere now @maximilianh ?

matthewspeir commented 2 years ago

Ah, never mind. Thanks for the clarification, Max!

maximilianh commented 2 years ago

It's not "necessary" for other files. But if it's there, it's used in a lot of places, as the numpy arrays have a method to write their contents to text files which is three times faster than the manual "write number, write tab, writer number, write tab" loop a million times.

This makes the code complicated, because a lot of places say "if numpy: do this otherwise: do this".

I have a feeling that numpy is so standard these days that we I don't need to make it work without numpy? But I may be wrong.

On Tue, May 31, 2022 at 5:39 PM Matt Speir @.***> wrote:

I think numpy should only be necessary if you're working with mtx files (which I think we mention in the installation instructions on readthedocs), but maybe that's changed and it's necessary elsewhere now @maximilianh https://github.com/maximilianh ?

— Reply to this email directly, view it on GitHub https://github.com/maximilianh/cellBrowser/pull/239#issuecomment-1142298837, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACL4TLOD3URU57PJTVYCBTVMYXB7ANCNFSM5SDJJ3UQ . You are receiving this because you were mentioned.Message ID: @.***>