atakanekiz / CIPR-Package

Cluster Identity Predictor (R package implementation)

How to run CIPR for 2 objects in the same global environment? #3

Open denvercal1234GitHub opened 3 years ago

denvercal1234GitHub commented 3 years ago

Dear @atakanekiz,

Thanks for a great package! Would you mind helping me address the questions below, when you get a chance?

  1. I have two cluster-marker objects that I want to run CIPR on for comparison, but when I run CIPR on both in the same Global Environment, the outputs for the first object are always overwritten by those of the second.

Would you mind letting me know the best way to keep them apart? I tried assigning the result to different object names (e.g., object1 <- CIPR(...)), but that did not work.

  2. How can I make the points in the in_clu_plots smaller? I tried calling scale_size_manual(), but it did not work.

  3. In the CIPR_top_results data frame, why are the predictions for cluster 2, for instance, far down the table (at a higher row index) rather than in the order clusters 1, 2, 3, ...? Is that just how the cluster numbers get sorted? I can see that within a cluster, the top "long_name" entries are ordered by identity_score, then by z_score, and then by percent_pos_correlation.

  4. Is there a way to obtain the list of genes matched between the query cluster and the reference cell type that were used to generate the prediction?

Thank you again for your help!

atakanekiz commented 3 years ago

Hey there! Thanks for using the package. I will come up with more elegant solutions and in-depth answers when I find some time, but these should help for the time being.

1- This is a design flaw really (and not the best coding practice, I think). I will fix this later, but meanwhile, you can copy the CIPR_results object that gets generated when you run CIPR() to a new name before the next run, as follows: first_analysis <- CIPR_results. That way you will retain the first analysis results.

2- I can't remember off the top of my head, but you can change the relevant parts of the plot object (which is essentially a list). Check out this SO thread. There are other questions like this that might be more relevant to our case here, but it should get you started. Another solution may be to save the graph at a larger size, which makes the dots relatively smaller.

3- I will have to look at this more closely, but I assume it is because, from the computer's point of view, the cluster names are sorted as text rather than as numbers, so the order goes 0, 1, 10, 100, 1000, 11, etc., and only then 2. This can happen whenever you order numbers stored as "character" type. If I'm right, I will think of a solution for this soon.

4- The CIPR-Shiny implementation has a code window that reports what you are looking for. In the package implementation, the easiest approach is probably to access the data files of the package and compare them with your experimental data set. Something like this, probably:

common_genes <- intersect(CIPR::immgen_expr$GeneName, your_data_frame$genename)
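
A minimal sketch of the same overlap idea, run on toy data frames so it works without the package installed. The column names GeneName and genename follow the line above; the toy gene symbols, the cluster column, and the case-insensitive comparison are assumptions for illustration only, and this may not be how CIPR itself matches genes.

```r
# Toy stand-ins for the reference expression table and the query marker table.
# With real data these would be the package reference data and your own results.
reference_df <- data.frame(
  GeneName = c("Cd3e", "Cd4", "Cd8a", "Foxp3"),
  stringsAsFactors = FALSE
)
query_df <- data.frame(
  genename = c("CD4", "Foxp3", "Il2ra", "cd8a"),
  cluster  = c("1", "1", "2", "2"),   # hypothetical cluster column
  stringsAsFactors = FALSE
)

# Compare gene symbols case-insensitively (assumption: capitalisation may
# differ between the reference and the query data).
ref_genes   <- tolower(reference_df$GeneName)
query_genes <- tolower(query_df$genename)

# Genes present in both tables, regardless of cluster.
common_genes <- intersect(ref_genes, query_genes)

# The same overlap, computed separately for each query cluster.
shared_by_cluster <- lapply(
  split(query_genes, query_df$cluster),
  intersect, y = ref_genes
)
```

Here common_genes holds every symbol found in both tables, and shared_by_cluster is a named list with one vector of shared symbols per cluster.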

I hope this helps. Let me know if you have other questions.

Atakan
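
The text-versus-number ordering described in point 3 can be reproduced with a small toy table. This is only a sketch: the cluster column name and the values are made up, identity_score is borrowed from the question, and nothing here is taken from the package's own code.

```r
# Toy results table; the "cluster" column name and all values are assumptions.
toy_results <- data.frame(
  cluster        = c("10", "2", "1", "2", "10", "1"),
  identity_score = c(5.1, 7.3, 9.0, 8.8, 6.2, 4.4),
  stringsAsFactors = FALSE
)

# Sorting character labels compares them as text, so "10" comes before "2".
# method = "radix" sorts in the C locale, which keeps the result reproducible.
text_order <- sort(unique(toy_results$cluster), method = "radix")

# Convert the labels to numbers before ordering: clusters ascending,
# and within each cluster the highest identity_score first.
ordered_results <- toy_results[
  order(as.numeric(toy_results$cluster), -toy_results$identity_score),
]
```

If the cluster labels are not purely numeric (for example "T_cells"), as.numeric() returns NA with a warning, so this approach only fits numeric cluster names.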

denvercal1234GitHub commented 3 years ago

Dear @atakanekiz,

Thank you very much for your suggestions! These are very helpful. Looking forward to the next update whenever that will be. 😁

  1. I wanted to confirm that you meant the CIPR_top_results and CIPR_all_results objects, which are the outputs of CIPR (and not a CIPR_results object, which is not found)?

  2. I think you are right regarding the order. I just wanted to make sure there was not a particular ranking behind the scene that results in the observed ordering.

  3. I checked the Analysis details tab of https://aekiz.shinyapps.io/CIPR/, but it only reports the number of genes shared. Did you mean for me to look elsewhere on that site?
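
On the first point, the workaround of copying results before the next run can be sketched as follows. The function fake_cipr() and the helper snapshot_results() are hypothetical: fake_cipr() only mimics a function that writes CIPR_top_results and CIPR_all_results into an environment as a side effect, and it is not the real CIPR().

```r
# Hypothetical stand-in for a function that stores its results in an
# environment as a side effect instead of returning them.
fake_cipr <- function(scores, envir) {
  top <- head(sort(scores, decreasing = TRUE), 2)
  assign("CIPR_top_results", top, envir = envir)
  assign("CIPR_all_results", scores, envir = envir)
  invisible(NULL)
}

# Hypothetical helper: copy the named result objects out of an environment
# into a list, so a later run cannot overwrite them.
snapshot_results <- function(envir = globalenv(),
                             names = c("CIPR_top_results", "CIPR_all_results")) {
  mget(names, envir = envir)
}

work_env <- new.env()

fake_cipr(c(3, 9, 5), envir = work_env)
first_analysis <- snapshot_results(work_env)    # copy before the second run

fake_cipr(c(8, 1, 4), envir = work_env)         # overwrites the objects in work_env
second_analysis <- snapshot_results(work_env)
```

With the real package, the equivalent would presumably be to copy CIPR_top_results and CIPR_all_results to new names right after the first CIPR() call and before the second one.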