almaan / stereoscope

Spatial mapping of cell types by integration of transcriptomics data
MIT License

ValueError: could not convert string to float: 'AAACAAGTATCTCCCA-1' #2

cartal closed 4 years ago

cartal commented 4 years ago

Hi @almaan!

First of all, thanks for sharing this fantastic work! :D

I have been trying to run stereoscope on my data, and the stereoscope run step completes smoothly. However, when I try to plot the results I run into an annoying error: ValueError: could not convert string to float: 'AAACAAGTATCTCCCA-1'. After checking the output file, I don't see any potential problems (i.e., no spaces, tabs, or anything suspicious). Since this is the output from the first step, I would think it is in the right format.

The command I'm trying looks like this:

stereoscope look -pp W.2020-04-27172618.217811.tsv -o viz -sc i -sb s -nc 7 -c "umap" -g -ms 40

The error I get is this:

Traceback (most recent call last):
  File "/Users/ctl/.local/bin/stereoscope", line 11, in <module>
    load_entry_point('STereoSCope==0.2', 'console_scripts', 'stereoscope')()
  File "/Users/ctl/.local/lib/python3.7/site-packages/STereoSCope-0.2-py3.7.egg/stsc/__main__.py", line 16, in main
    look(args)
  File "/Users/ctl/.local/lib/python3.7/site-packages/STereoSCope-0.2-py3.7.egg/stsc/look.py", line 322, in look
    crdlist = [get_crd(w) for w in wlist]
  File "/Users/ctl/.local/lib/python3.7/site-packages/STereoSCope-0.2-py3.7.egg/stsc/look.py", line 322, in <listcomp>
    crdlist = [get_crd(w) for w in wlist]
  File "/Users/ctl/.local/lib/python3.7/site-packages/STereoSCope-0.2-py3.7.egg/stsc/look.py", line 259, in get_crd
    crd = np.array(crd).astype(float)
ValueError: could not convert string to float: 'AAACAAGTATCTCCCA-1'

Do you think this has to do with a difference in package versions? I'm using pandas 0.23.4 and numpy 1.16.1.

Looking forward to hearing from you.

Regards

almaan commented 4 years ago

Hi @cartal! Thrilled to hear that you find stereoscope useful!

I would guess that the issue here is that you have barcodes rather than spatial coordinates as indices in your spatial data, and as a consequence in the W-files (proportion estimates). In order for the visualization (look module) to work smoothly, the fastest solution is to change the indices to the format [x_coordinate]x[y_coordinate]. To clarify, one row name could for example be "2358x12812".
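
To illustrate why barcode row names trip the look module: the traceback shows the coordinates being cast to float, so any row name that does not reduce to two numbers will fail. The following is a minimal sketch, not the actual stereoscope code. It assumes the name is split on a literal "x", and parse_crd and bad_row_names are hypothetical helpers for checking a W-file's row names before plotting.

```python
def parse_crd(name):
    """Parse a row name of the form "[x]x[y]" into a pair of floats."""
    # Assumption: the two coordinates are separated by a literal "x".
    parts = name.split("x")
    if len(parts) != 2:
        raise ValueError(f"expected '[x]x[y]', got {name!r}")
    return float(parts[0]), float(parts[1])


def bad_row_names(names):
    """Return the row names that cannot be parsed as coordinates."""
    bad = []
    for name in names:
        try:
            parse_crd(name)
        except ValueError:
            bad.append(name)
    return bad
```

For example, bad_row_names(["2358x12812", "AAACAAGTATCTCCCA-1"]) returns only the barcode, since it contains no "x" separator and is not a number.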

I'm guessing, from your error message, that you are working with Visium data; hence this should be relatively easy to fix by fetching the coordinates from the tissue_positions_list.csv file found in the spatial folder that spaceranger provides, and then updating the row names in your W-files accordingly. If I may give a recommendation, it would be to use the pixel coordinates and not the array coordinates.
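
The renaming described above could be sketched roughly as follows, using only the Python standard library. This is an illustration under stated assumptions, not a script shipped with stereoscope: it assumes the header-less tissue_positions_list.csv layout (barcode, in_tissue, array_row, array_col, pixel row, pixel column), assumes the pixel column should serve as x and the pixel row as y, and assumes the W-file is tab-separated with the spot names in its first column. The function names are hypothetical.

```python
import csv


def load_pixel_coords(positions_path):
    """Map each barcode to an "[x]x[y]" label built from full-resolution pixel coordinates."""
    coords = {}
    with open(positions_path, newline="") as handle:
        for row in csv.reader(handle):
            # Assumed column order (header-less spaceranger file):
            # barcode, in_tissue, array_row, array_col, pxl_row, pxl_col
            barcode, _in_tissue, _arow, _acol, pxl_row, pxl_col = row[:6]
            # Assumption: pixel column is x, pixel row is y.
            coords[barcode] = f"{pxl_col}x{pxl_row}"
    return coords


def relabel_rows(w_in, w_out, coords):
    """Copy a tab-separated W-file, replacing barcode row names with coordinate labels."""
    with open(w_in, newline="") as src, open(w_out, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerow(next(reader))  # header row with the cell type names
        for row in reader:
            # A barcode missing from the positions file raises KeyError
            # instead of being passed through silently.
            writer.writerow([coords[row[0]]] + row[1:])
```

A call might look like relabel_rows("W.tsv", "W.xy.tsv", load_pixel_coords("spatial/tissue_positions_list.csv")), where all three paths are placeholders. Writing to a new file keeps the original proportion estimates untouched in case the coordinate orientation needs to be swapped.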

This is a design flaw on my side. I will try to push some updates that allow you to provide a coordinate file so you don't have to do this manually, or alternatively provide you with a script for this purpose. However, I cannot say exactly when this will happen, which is why I believe it's faster for you to change your files rather than wait for me.

Sorry for any inconvenience, and let me know if the issue persists!

cartal commented 4 years ago

Hi @almaan,

Thanks for the quick reply and helpful advice. After fixing it the way you described, the command runs without errors, but then the folder is empty :o

This happens on both a Linux machine and my local iMac.

Any ideas why this may be?

almaan commented 4 years ago

Glad to hear it worked!

I believe you might have spotted a bug here: the look module did not generate proper file names if the command was executed from the same folder that the proportion estimate files are located in. I managed to replicate this error, and just pushed a change which should fix it.

Thanks for letting me know, and if this does not resolve the issue for you, please do tell.

cartal commented 4 years ago

Fantastic!

Many thanks, it is working now.