gao-lab / Cell_BLAST

A BLAST-like toolkit for large-scale scRNA-seq data querying and annotation.
http://cblast.gao-lab.org
MIT License

h5 file generation problem #3

Closed evenDDDDD closed 4 years ago

evenDDDDD commented 4 years ago

1. Why does the following error appear when generating the h5 file: "MY_ERROR: Error in CreateSeuratObject (raw.data = object @ exprs, meta.data = object @ obs): The parameter is not useful (raw.data = object @ exprs) \ n"? I used the same data and R script from the collect part of your GitHub repository, and Seurat is installed.
2. Is cell_ontology necessary? Thank you!

Jeff1995 commented 4 years ago

Thanks for your interest! I guess it's a problem with incompatible Seurat versions (v3 changed the API significantly and is incompatible with v2). Our data collection scripts used Seurat v2.3.3. Could you confirm which Seurat version you are using?

evenDDDDD commented 4 years ago

Thank you very much for responding so quickly. I just checked my Seurat version and it is v3.1.1, so maybe I need to install Seurat v2.3.3. As for my other question, is the cell ontology annotation necessary? My data may not have this information. Thanks again!

Jeff1995 commented 4 years ago

Okay, great! Switching to Seurat v2 should solve the problem. The cell ontology annotation is unnecessary. Just skip the "cell_ontology" argument when constructing the dataset and it should work fine.

evenDDDDD commented 4 years ago

Thank you for your answer. I solved that problem smoothly, but I have run into another one. After generating the h5 file and training a DIRECTi model, the h5 file contains no latent, tSNE, or UMAP results. I also tried the "inference" method. I tried many things and no errors appeared, but it didn't work. How do these results get into the h5 file?

I'm sorry if I disturbed you.

Jeff1995 commented 4 years ago

Can you provide a code snippet to illustrate how you are using the model? To get the latent coordinates and write them to file, you can use `data.latent = model.inference(data)`, and then call `data.write_dataset("somefile.h5")`.

Some docs and examples can be found here: DIRECTi, ExprDataSet, Notebook.

Let me know if you have any further issues.

evenDDDDD commented 4 years ago

Sorry to disturb you again. How can I output a "matplotlib.axes._subplots.AxesSubplot object" as a picture, including the tSNE plot and the final comparison chart of Cell BLAST? I have little experience with plotting in Python. For example, when I run "ax = combined_dataset.visualize_latent("study")" in IPython, it only prints "[Info] Computing tSNE ..." without showing a picture. In addition, when using the "visualize_latent" method for visualization, the "cell ontology" information is missing. Can I use "cell type1" instead? Looking forward to your reply!

Jeff1995 commented 4 years ago

  1. If you are running IPython with access to a graphical interface, or using Jupyter Notebook, the picture should appear automatically once the tSNE computation is done (I have personally only used Jupyter Notebook, though). Computing tSNE can take a long time if the number of cells is large; in that case, just wait and let it finish. If no graphical interface is available (e.g. running IPython over ssh), the picture will not appear.

     For `matplotlib.axes._subplots.AxesSubplot` objects, you can use `ax.get_figure().savefig("file.pdf")` to save the plot to a file, assuming `ax` is the returned Axes object.

     For the Sankey comparison plot, the `cb.blast.sankey` function returns a plotly dict. The picture should appear automatically in Jupyter Notebook; otherwise you can use `plotly.io.write_image(d, "file.pdf")`, assuming `d` is the returned plotly dict.
  2. Yes, you can use any type of annotation that you have, not limited to cell ontology.
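
For the headless case in point 1 (e.g. IPython over ssh), a minimal sketch of saving an Axes object to a file might look like the following. It assumes matplotlib is installed and selects the non-interactive `Agg` backend so no display is needed. The scatter plot is only a hypothetical stand-in for the Axes returned by `visualize_latent`, and `save_axes` is an illustrative helper, not part of Cell BLAST.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: works without a display (e.g. over ssh)
import matplotlib.pyplot as plt


def save_axes(ax, path):
    """Save the figure that owns `ax` to `path`; the format follows the file extension."""
    fig = ax.get_figure()
    fig.savefig(path, bbox_inches="tight")
    return path


# Stand-in for the Axes returned by visualize_latent (made-up points, not Cell BLAST output).
fig, ax = plt.subplots()
ax.scatter([0.1, 0.4, 0.9], [0.3, 0.8, 0.2])
ax.set_xlabel("tSNE1")
ax.set_ylabel("tSNE2")

out_path = save_axes(ax, os.path.join(tempfile.mkdtemp(), "latent.pdf"))
print(os.path.getsize(out_path) > 0)
```

With a real dataset, the same helper would presumably be called as `save_axes(ax, "latent.pdf")` on the object returned by `visualize_latent`; changing the extension to `.png` writes a raster image instead.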

evenDDDDD commented 4 years ago

Hi! Your work on defining cell types is excellent. I want to reproduce it, but I ran into a problem. My code is below:

`expr_mat <- read.table("./p2_counts.txt", header = TRUE, row.names = 1)
expr_mat1 <- as.matrix(expr_mat)

meta_df <- read.table("./p2_metadata.txt", header = TRUE, row.names = 1, sep='\t')
colnames(meta_df) <- c("cell_type1")

cell_ontology <- read.csv("./p2_cell_ontology.csv", sep='\t')
cell_ontology <- cell_ontology[, c("cell_type1", "cell_ontology_class", "cell_ontology_id")]

construct_dataset("./p22_10x/", as.matrix(expr_mat), meta_df, datasets_meta = NULL, cell_ontology)`

The error is: Error in validObject(.Object) : invalid class “ExprDataSet” object: FALSE

I can't understand what is happening. My installed Seurat is v2.3.4 and R is 3.6.3. I look forward to your reply!

Jeff1995 commented 4 years ago

It is likely because meta_df differs from expr_mat in terms of row number and row names. Could you validate that:

nrow(expr_mat) == nrow(meta_df)
all(rownames(expr_mat) == rownames(meta_df))
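
If either check fails, it usually helps to see which names differ and whether only the order is off. A rough sketch of that diagnosis in Python follows, with plain lists standing in for the two `rownames()` vectors. The function name and the example cell names are hypothetical and only illustrate the comparison.

```python
def compare_names(expr_names, meta_names):
    """Report how two lists of cell names differ (stand-ins for the two rownames() vectors)."""
    expr_set, meta_set = set(expr_names), set(meta_names)
    identical = list(expr_names) == list(meta_names)
    return {
        "same_length": len(expr_names) == len(meta_names),
        "identical": identical,
        "only_in_expr": sorted(expr_set - meta_set),
        "only_in_meta": sorted(meta_set - expr_set),
        # Same names on both sides, but not in the same order.
        "same_set_different_order": expr_set == meta_set and not identical,
    }


# Hypothetical cell names, for illustration only.
report = compare_names(["c1", "c2", "c3"], ["c1", "c3", "c4"])
print(report)
```

A non-empty `only_in_expr` or `only_in_meta` points to missing or misnamed cells, while `same_set_different_order` suggests the metadata only needs reordering. If most names fail to match, it may also be worth checking whether the expression matrix is transposed relative to what the function expects.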

evenDDDDD commented 4 years ago

Hi! Thank you very much for your reply! I modified my file to make sure that no names are mismatched. Two questions have troubled me for a day:

1. In the Query step, I used two different input files and a reference for BLAST, but this error always appears when using the second file:

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

I'm sure my second input file does not contain any NaN. Is this error related to my reference or to the input file that needs to be annotated?

2. After updating to the latest version of Cell BLAST, I get this error: AttributeError: 'BLAST' object has no attribute 'build_empirical'

Below is my code:

`expr_mat <- read.table("gene_99_counts_2.txt", header = TRUE, row.names = 1)
expr_mat1 <- t(expr_mat)
meta_df <- read.table("p2_metadata.txt", header = TRUE, row.names = 1, sep='\t')
meta_df$region = "Hypothalamus"
cell_ontology <- read.csv("p2_cell_ontology.csv", sep='\t')
cell_ontology <- cell_ontology[, c("cell_type1", "cell_ontology_class", "cell_ontology_id")]
construct_dataset("./p22_99/", as.matrix(expr_mat1), meta_df, datasets_meta = NULL, cell_ontology)`

`blast = cb.blast.BLAST(models3, adata).build_empirical()`

Versions: tensorflow v1.8.0, cell blast v0.3.7, R 3.6.3

Jeff1995 commented 4 years ago

  1. My guess is that the expression matrix contains cells with all-zero expression, or contains negative values (the expression matrix should consist of non-negative raw UMI counts).

  2. The API has changed a little bit since v0.3. Please refer to the new tutorial here.
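
The guess in point 1 can be checked before querying. The sketch below uses NumPy and assumes the matrix is laid out as cells by genes (rows are cells); that orientation, the function name, and the toy matrix are all assumptions made for illustration, so transpose first if your data is genes by cells.

```python
import numpy as np


def diagnose_counts(mat):
    """Summarize problems in a cells-by-genes count matrix (rows are assumed to be cells)."""
    mat = np.asarray(mat, dtype=float)
    return {
        "n_nan": int(np.isnan(mat).sum()),
        "n_inf": int(np.isinf(mat).sum()),
        "n_negative": int((mat < 0).sum()),
        # Row indices of cells whose counts are zero for every gene.
        "all_zero_cells": np.flatnonzero((mat == 0).all(axis=1)).tolist(),
    }


# Hypothetical toy matrix: cell 1 has no counts, cell 2 has a negative entry.
toy = [[3, 0, 1],
       [0, 0, 0],
       [2, -1, 4]]
summary = diagnose_counts(toy)
print(summary)
```

An input file can contain no NaN and still trigger the error: if a later step normalizes each cell by its total count (an assumption, the thread does not say how the NaN arises), an all-zero cell would produce 0/0. Dropping the cells listed in `all_zero_cells` and confirming `n_negative` is 0 is a reasonable first thing to try.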