raymondlouie / MiniMarS


subsamples from the processSubsampling function #25

Closed HsiaoChiLiao closed 1 year ago

HsiaoChiLiao commented 1 year ago

Hi Ray,

I installed the newest version (0.1.2) of the package and found that the dimensions of the training/test sets returned by processSubsampling seem to be incorrect.

> dim(cluster_selection_out$matrix)
[1] 49057    97
> final_out = processSubsampling(cluster_selection_out,
+                                subsample_num = 1000,
+                                train_test_ratio = 0.9,
+                                cluster_proportion = "proportional",
+                                verbose = FALSE,
+                                seed = 8)
There were 40 warnings (use warnings() to see them)
> print(dim(final_out$training_matrix))
[1] 909  97
> print(dim(final_out$test_matrix))
[1] 2424   97

In my previous runs, subsample_num was the total number of cells across the training and test sets combined (i.e., # in training + # in test ≈ 1000). Here the two sets add up to 909 + 2424 = 3333 cells.
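To make the expectation concrete, here is a rough sketch of the sizes I had in mind. It assumes that train_test_ratio is the fraction of the subsample that goes to the training set, which is my reading and not something I have checked in the package code. The helper expected_split_sizes is hypothetical and not part of MiniMarS.

```r
# Hypothetical helper (not part of MiniMarS): expected set sizes if
# subsample_num is the combined number of training and test cells and
# train_test_ratio is the fraction assigned to training (an assumption).
expected_split_sizes <- function(subsample_num, train_test_ratio) {
  n_train <- round(subsample_num * train_test_ratio)
  n_test <- subsample_num - n_train
  c(training = n_train, test = n_test)
}

expected <- expected_split_sizes(subsample_num = 1000, train_test_ratio = 0.9)
print(expected)

# Row counts reported in the output above for version 0.1.2
observed <- c(training = 909, test = 2424)
print(sum(observed))
```

With proportional sampling per cluster, small deviations from these numbers due to rounding would not surprise me, but the test set should still be far smaller than the training set.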

Thanks a lot!

raymondlouie commented 1 year ago

Sorry, fixed now.

HsiaoChiLiao commented 1 year ago

Thanks! Just got all analyses queuing/running on the server now. :)