Closed carlocolantuoni closed 3 months ago
Found an error with a sample name, 1dayaftersham: "Could not convert string to float". If I look at the first sample line
"GSE35338_Biomat_20___BioAssayId=165640Name=Astrocyte,1dayaftersham,biologicalrep4",253.03062
it seems to me that the scanpy read_csv function is ignoring the quotes around the sample name and then splitting the name at its commas. Scanpy's function is set to treat the second column onwards as floats (since it should be data).
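A minimal sketch of the suspected failure mode, using only the Python standard library and the sample line quoted above. The naive comma split is an assumption about how the error could arise, not a copy of scanpy's or anndata's parsing code; the variable names are illustrative only.

```python
import csv
import io

# The sample line from the report: a quoted label that itself contains commas,
# followed by one numeric value.
line = (
    '"GSE35338_Biomat_20___BioAssayId=165640Name=Astrocyte,'
    '1dayaftersham,biologicalrep4",253.03062\n'
)

# Naive split (assumed failure mode): the quotes are ignored, so the label is
# cut into three pieces and "1dayaftersham" lands in a data column.
naive_fields = line.strip().split(",")

try:
    [float(v) for v in naive_fields[1:]]
    naive_error = None
except ValueError as exc:
    naive_error = str(exc)

# Quote-aware parse: csv.reader keeps the quoted label as a single field.
quoted_fields = next(csv.reader(io.StringIO(line)))
label = quoted_fields[0]
values = [float(v) for v in quoted_fields[1:]]

print(naive_fields)
print(naive_error)
print(label, values)
```

With the naive split, the second field is the text fragment from inside the label, which cannot be converted to a float; the quote-aware reader returns the full label plus one numeric value.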
I feel like we have observed this before, but I need to see if it is in past emails perhaps. It may have been with using dots in the sample name.
There is also a "failed with error status 500" dataset towards the bottom... explanation below
Looks like the default curation being plotted for that dataset uses a saved analysis, but the "user_saved" analysis directory (and file) for this dataset is not on the filesystem. @jorvis, I'm not sure how you created this VM, but is there a chance some user_saved analyses did not transfer over?
@carlocolantuoni if you are pressed for time, maybe for this dataset you can create a new curation using the primary analysis default, and when you save, check the "make default display" option
I tested reading the CSV file in the Python REPL to see if I could reproduce the error without all the extraneous code, and I got the same error. I'm going to create a ticket in the AnnData GitHub repo
thnx - i can make another view and set as default as u suggest while you work on it - thnx
making the new view didnt seem to help.
btw - there is no prob in gene viewing, just projection - in the past i know i have seen successful plots of projection in this dataset, so as u say shaun, it might be related to what did and did not get transferred
I think you did the wrong dataset. I was referring to the "Mouse (6 months), scRNA-seq, immune cells from whole brains of AD model (5xFAD) (Amit)" dataset that is missing the saved analysis file, not the first discussed dataset. The Amit dataset shows the same error on expression and projection views.
o - ok.. when i try to curate a view for that one, there are no options in the metadata pulldown menus - only "expression"
Oh ok, then that means the current default curation was using metadata from the saved analysis then.
also there is a ticket created in the "anndata" repo for the other dataset issue
Hi, since the CSV format isn’t a standard and rather “whatever the developer of whatever language or framework felt like at the time”, there are no CSV parsing/writing bugs, just choices.
If you want to be able to rely on reading a file you wrote, avoid CSV/TSV/… as an intermediary format.
If you know exactly what CSV settings you need to read a specific CSV file, use pandas’ read_csv function and the AnnData constructor to create an object from the expression/metadata parts of the data frame.
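As a rough sketch of what "knowing the exact CSV settings" could look like with pandas, the example below spells out the delimiter and quoting rules instead of relying on defaults. The CSV text is toy data, and the column names (PC1, PC2) and the second row label are made up for illustration; only the first label comes from this thread.

```python
import csv
import io

import pandas as pd

# Toy CSV (assumed layout): first column is the sample label, remaining
# columns are numeric. Labels are quoted because they contain commas.
csv_text = (
    "sample,PC1,PC2\n"
    '"Astrocyte,1dayaftersham,biologicalrep4",253.03062,1.5\n'
    '"toy,label,two",198.4,2.25\n'
)

df = pd.read_csv(
    io.StringIO(csv_text),
    sep=",",                    # field delimiter
    quotechar='"',              # character that wraps labels containing commas
    quoting=csv.QUOTE_MINIMAL,  # honour quotes only where they appear
    index_col=0,                # first column holds the row labels
    header=0,                   # first row holds the column names
)

# Split the frame into the pieces an AnnData constructor would need.
X = df.to_numpy(dtype=float)
obs_names = df.index.to_list()
var_names = df.columns.to_list()

print(X.shape, obs_names, var_names)
```

The resulting array and name lists correspond to the expression and metadata parts of the data frame mentioned above; passing them on to the AnnData constructor is left out here so the sketch depends only on pandas.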
https://github.com/scverse/anndata/issues/1573 will not be worked on. So two things need to happen.
Just saw your reply (was en route to work so I missed it)... the original use case was to take a valid AnnData object and replace the data in adata.X and adata.var with the data from the CSV. My strategy was to use anndata.read_csv to populate the X, and replace our "adata.var" contents with the columns from the CSV.
I'll try the pandas read_csv and AnnData constructor and see if this is a simpler solution than what I was proposing in my previous comment. Thanks for the info @flying-sheep!
The previous commit resolved the issue @carlocolantuoni was having
code here
import anndata
import pandas as pd
# dataset_adata: the existing AnnData object whose X and var are being replaced (loaded beforehand)
# READ CSV to make X and var
df = pd.read_csv(projection_csv_path, sep=',', index_col=0, header=0)
X = df.to_numpy()
var = pd.DataFrame(index=df.columns)
obs = dataset_adata.obs
obsm = dataset_adata.obsm
# Create the anndata object and write to h5ad
# Associate with a filename to ensure AnnData is read in "backed" mode
projection_adata = anndata.AnnData(X=X, obs=obs, var=var, obsm=obsm, filename=projection_adata_path, filemode='r')
# For some reason the gene_symbol is not taken in by the constructor
projection_adata.var["gene_symbol"] = projection_adata.var_names
### use projection_adata downstream in place of dataset_adata
Once again, thanks for the suggestion and assistance @flying-sheep
great, happy to be of assistance!
Linking #623 for the second dataset where analysis is missing (presumably deleted)
thnx guys!
when i run this projection: https://nemoanalytics.org/projection.html?projection_algorithm=pca&multipattern_plots=0&projection_source=f61159d5&layout_id=f1b93141&projection_patterns=FC
for this dataset: "Expression data from reactive astrocytes acutely purified from young adult mouse brains"
i am getting this error: "Could not create projection AnnData object from CSV."
can you figure out why?