What would it take to use tricycle on species other than mouse and human?

Hsu-Che-Wei commented 1 year ago

Hi,

Thanks for the tool! I am interested in applying tricycle on plant single-cell data, and it seems that we can not rely on the reference provided. In that case, what would it take to train a reference for the plant?

Best, Che-Wei

kasperdanielhansen commented 1 year ago

Have you actually tried tricycle? If so, I would like to see the results.

On one hand, I would not be super surprised if it does not work for plants. On the other hand, I am pretty sure we have used it successfully for yeast, so I don't think it is out of the question that it works for plants. We have been quite surprised at how robust the reference is.

What you need is a dataset where the predominant variation is cell cycle - I would recommend proliferating cells in a dish. However, in our experience it is also critical to have replicates. Integrating >=2 replicates when constructing the reference really improves the signal. Our reference uses 2 replicates, and I am sure >2 is better than =2, but I don't know how much improvement you get by going beyond 2. I do know that 2 is much better than 1.

Best, Kasper

Hsu-Che-Wei commented 11 months ago

Hi Kasper,

Tricycle seems to work with plant root cells. However, the PC plot does not have the circular structure shown in your paper; instead, it has an L shape, suggesting that the transition from S to G2M is quite abrupt. That is, there is no smooth transition from the S phase to the G2M phase. What do you think could be the issue here?

Thank you

Best, Che-Wei

rahulnutron commented 5 months ago

Dear Che-Wei, Have you been able to use Tricycle on the plant dataset successfully? I am trying to do that, but I am not sure how to set up a custom reference projection matrix for a plant dataset. Could you help me with that?

Regards, Rahul

kasperdanielhansen commented 5 months ago

I see I did not reply to the plots posted a long time ago.

I think the plots suggest that the current approach is not working. This could either be because (a) the reference does not work in plants or (b) there is some mistake in the code or in matching the gene names between organisms. I would spend some time on (b) before I give up, especially on the gene name conversion. Having said that, we don't know if it works in plants.
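For point (b), one quick sanity check is to count how many reference gene IDs actually appear in the rownames of the new data before projecting. This is a minimal sketch in base R with toy objects: `ref_ids` and `expr` are hypothetical stand-ins, not tricycle objects, and the suffix stripping only illustrates one common kind of mismatch (transcript-style IDs and case differences).

```r
# Toy stand-ins: reference gene IDs and an expression matrix (genes x cells).
ref_ids <- c("AT1G52070", "AT2G36100", "AT3G59370", "AT4G40090")
expr <- matrix(
  rpois(12, lambda = 5), nrow = 4,
  dimnames = list(c("AT1G52070.1", "AT2G36100.1", "at3g59370", "AT9G99999"),
                  paste0("cell", 1:3))
)

# Direct match: how many reference genes are found as-is?
n_direct <- sum(ref_ids %in% rownames(expr))

# Normalise IDs (upper case, drop a trailing ".<digits>" suffix) and retry.
clean_ids <- sub("\\.[0-9]+$", "", toupper(rownames(expr)))
n_clean <- sum(ref_ids %in% clean_ids)

cat("matched directly:", n_direct, "| after cleaning:", n_clean, "\n")
```

If the direct count is zero or far below the size of the reference, the ID conversion is the first thing to look at.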

To construct a new reference embedding you need a single-cell dataset, preferably with replicates (at least 2), where the only or major driver of expression is cell cycle, and preferably with few cell types. I am not super familiar with plants, but if I had 5 minutes to design such an experiment I would take a cell culture with replicates, for example cultures created from 2 different plants.
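To picture what such a reference embedding looks like, it can be thought of as a gene-by-2 matrix of PCA loadings learned from data dominated by cell cycle. The sketch below simulates such data and extracts the first two loadings with base R `prcomp`. The simulation, the object names, and the absence of any replicate integration are simplifying assumptions for illustration; this is not the procedure used to build the package reference.

```r
set.seed(1)
n_cells <- 300
n_genes <- 50

# Simulated cell-cycle position (radians) for each cell.
theta <- runif(n_cells, 0, 2 * pi)

# Each gene peaks at its own phase; expression = cosine wave + noise.
peak <- runif(n_genes, 0, 2 * pi)
expr <- sapply(theta, function(t) cos(t - peak)) +
  matrix(rnorm(n_genes * n_cells, sd = 0.3), n_genes, n_cells)
rownames(expr) <- sprintf("gene%02d", seq_len(n_genes))

# PCA on the cells x genes matrix, centred but not scaled.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Candidate reference: loadings of the first two PCs.
# Numeric, genes in rows, gene IDs as rownames.
ref_mat <- pca$rotation[, 1:2]
colnames(ref_mat) <- c("pc1.rot", "pc2.rot")

# Project cells onto the reference and read off an angle per cell.
emb <- scale(t(expr), center = TRUE, scale = FALSE) %*% ref_mat
est_theta <- atan2(emb[, 2], emb[, 1]) %% (2 * pi)
```

With real data, a plot of `emb` should show a ring or ellipse if cell cycle is the dominant signal; an L shape or a blob suggests other sources of variation are driving the first two components.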

rahulnutron commented 5 months ago

Dear Dr. Hansen, Thank you for your reply. As a trial run, I copied the embedding from the mouse reference and replaced the gene IDs with Arabidopsis genes, like this:

    head(copy_nf)
                 pc1.rot     pc2.rot   ensembl    symbol    SYMBOL
    AT1G52070 -0.1989885  0.11245580 AT1G52070 AT1G52070 AT1G52070
    AT2G36100 -0.1878494  0.04252486 AT2G36100 AT2G36100 AT2G36100
    AT3G59370 -0.0567803 -0.07531686 AT3G59370 AT3G59370 AT3G59370
    AT4G40090 -0.1662416 -0.15992459 AT4G40090 AT4G40090 AT4G40090
    AT1G12090 -0.1721841 -0.11416861 AT1G12090 AT1G12090 AT1G12090
    AT5G60530 -0.1357597 -0.19290262 AT5G60530 AT5G60530 AT5G60530

Then I get the normalized data from Seurat:

    data <- GetAssayData(mz, slot = 'data', assay = 'SCT')

However, when I run the code below, I get this error:

    ara_cc <- project_cycle_space(data, ref.m = copy_nf)
    The number of projection genes found in the new data is 500.
    Error in scale(t(as.matrix(data.m)), center = TRUE, scale = FALSE) %*%  :
      requires numeric/complex matrix/vector arguments

or this error when I convert to numeric:

    ara_cc <- project_cycle_space(as.numeric(data), ref.m = copy_nf)
    Error in .calProjection(data.m, ref.m) :
      None genes found in new data. This could be caused by wrong input of rownames type.
    In addition: Warning message:
    In .sparse2v(x) : sparse->dense coercion: allocating vector of size 1.9 GiB

Interestingly, when I used the provided mouse data with the "ref.m" argument, I got the same error:

    data(neuroRef)
    data(neurosphere_example)
    ara_cc <- project_cycle_space(neurosphere_example, ref.m = neuroRef)
    Error in .calProjection(data.m, ref.m) :
      None genes found in new data. This could be caused by wrong input of rownames type.

but it worked without ref.m:

    ara_cc <- project_cycle_space(neurosphere_example)

Does that mean there is something wrong when ref.m is used, given that even the neuroRef provided with the package didn't work? Any help is appreciated.

Rahul

rahulnutron commented 5 months ago

Converting to a matrix also results in the error, as does using a SingleCellExperiment object:

    ara_cc <- project_cycle_space(as.matrix(data), ref.m = copy_nf)
    The number of projection genes found in the new data is 500.
    Error in scale(t(as.matrix(data.m)), center = TRUE, scale = FALSE) %*%  :
      requires numeric/complex matrix/vector arguments
    In addition: Warning message:
    In asMethod(object) : sparse->dense coercion: allocating vector of size 1.9 GiB

    ara_cc <- project_cycle_space(sce, ref.m = copy_nf)
    The number of projection genes found in the new data is 500.
    Error in scale(t(as.matrix(data.m)), center = TRUE, scale = FALSE) %*%  :
      requires numeric/complex matrix/vector arguments
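The `requires numeric/complex matrix/vector arguments` message is what base R raises when `%*%` receives a non-numeric operand. One possible cause (an assumption, not confirmed in this thread) is that `copy_nf` is a data frame that still carries the character columns `ensembl`, `symbol` and `SYMBOL`. The sketch below reproduces the message with a toy data frame and shows one way to build a numeric two-column matrix with gene IDs as rownames. Whether this is the form `ref.m` expects should be checked against the package documentation.

```r
# Toy copy of the first rows of copy_nf (values taken from the post above).
copy_nf <- data.frame(
  pc1.rot = c(-0.1989885, -0.1878494, -0.0567803),
  pc2.rot = c(0.11245580, 0.04252486, -0.07531686),
  ensembl = c("AT1G52070", "AT2G36100", "AT3G59370"),
  row.names = c("AT1G52070", "AT2G36100", "AT3G59370")
)

# Toy centred expression, cells x genes (hypothetical values).
x <- matrix(rnorm(6), nrow = 2,
            dimnames = list(c("cell1", "cell2"), rownames(copy_nf)))

# A data frame with a character column coerces to a character matrix,
# so the matrix product fails.
bad <- as.matrix(copy_nf)
err <- tryCatch(x %*% bad, error = function(e) conditionMessage(e))
print(err)

# Keep only the two loading columns and carry the gene IDs as rownames.
ref_mat <- as.matrix(copy_nf[, c("pc1.rot", "pc2.rot")])
rownames(ref_mat) <- copy_nf$ensembl
emb <- x %*% ref_mat
```

The same reasoning may apply to `ref.m = neuroRef`: if the rownames of the object passed in are not gene IDs of the same type as the rownames of the data, no genes can be matched.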
