ebecht / infinityFlow


Error in using GPU Tensorflow Mac M1 when doing neural network model #13

Closed denvercal1234GitHub closed 1 year ago

denvercal1234GitHub commented 1 year ago

Hi there,

When I followed the "training_non_default_regression_models" tutorial with the sample data as shown below, I encountered the errors shown.

Would you mind helping me fix this issue in my R environment? It does not seem that my TensorFlow is actually using the GPU.

Thank you for your help.

regression_functions <- list(
    XGBoost = fitter_xgboost, # XGBoost
    ## Passed to fitter_nn, e.g. neural networks through keras::fit. See https://keras.rstudio.com/articles/tutorial_basic_regression.html
    NN = fitter_nn,
    SVM = fitter_svm, # SVM
    LASSO2 = fitter_glmnet, # L1-penalized 2nd degree polynomial model
    LM = fitter_linear # Linear model
)

extra_args_regression_params <- list(
     ## Passed to the first element of `regression_functions`, e.g. XGBoost. See ?xgboost for which parameters can be passed through this list
    list(nrounds = 500, eta = 0.05),

    # ## Passed to the second element of `regression_functions`, e.g. neural networks through keras::fit. See https://keras.rstudio.com/articles/tutorial_basic_regression.html
    # Note (macOS with AMD GPU): I have been using tensorflow-metal with GPU acceleration since it launched. I sometimes get the same message ("Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support."), but it still uses the GPU. You can check this by opening Activity Monitor and pressing Cmd + 3 and Cmd + 4, which show GPU and CPU usage.
    list(
        object = { ## Specifies the network's architecture, loss function and optimization method
            model = keras_model_sequential()
            model %>%
                layer_dense(units = backbone_size, activation = "relu", input_shape = backbone_size) %>%
                layer_dense(units = backbone_size, activation = "relu", input_shape = backbone_size) %>%
                layer_dense(units = 1, activation = "linear")
            model %>%
                compile(loss = "mean_squared_error", optimizer = optimizer_sgd(lr = 0.005))
            serialize_model(model)
        },
        epochs = 1000, ## Number of maximum training epochs. The training is however stopped early if the loss on the validation set does not improve for 20 epochs. This early stopping is hardcoded in fitter_nn.
        validation_split = 0.2, ## Fraction of the training data used to monitor validation loss
        verbose = 0,
        batch_size = 128 ## Size of the minibatches for training.
    ),
    # Passed to the third element, SVMs. See help(svm, "e1071") for possible arguments
    list(type = "nu-regression", cost = 8, nu=0.5, kernel="radial"),

    # Passed to the fourth element, fitter_glmnet. This should contain a mandatory argument `degree` which specifies the degree of the polynomial model (1 for linear, 2 for quadratic, etc.). Here we use degree = 2, corresponding to our LASSO2 model. Other arguments are passed to getS3method("cv.glmnet", "formula").
    list(alpha = 1, nfolds=10, degree = 2),

    # Passed to the fifth element, fitter_linear. This only accepts a degree argument specifying the degree of the polynomial model. Here we use degree = 1 corresponding to a linear model.
    list(degree = 1)
)
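Since `infinity_flow()` matches each hyperparameter list to a fitter by position, the two lists must have the same length and stay in the same order. A minimal base-R sanity check, sketched here with a hypothetical helper name and toy stand-ins for the real fitters (none of these names are part of infinityFlow):

```r
## Hypothetical helper: verify there is exactly one hyperparameter list
## per regression function before calling infinity_flow().
check_model_args <- function(funs, params) {
    if (length(funs) != length(params)) {
        stop("Number of models and number of lists of hyperparameters mismatch")
    }
    invisible(TRUE)
}

## Toy stand-ins, purely to exercise the check
funs   <- list(XGBoost = identity, NN = identity)
params <- list(list(nrounds = 500), list(epochs = 1000))
check_model_args(funs, params)  # passes silently
```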

Output

Metal device set to: Apple M1 Max

systemMemory: 64.00 GB
maxCacheSize: 24.00 GB

2023-01-30 14:04:18.688257: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-01-30 14:04:18.688306: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
WARNING:absl:`lr` is deprecated, please use `learning_rate` instead, or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.SGD.
[[1]]
[[1]]$nrounds
[1] 500

[[1]]$eta
[1] 0.05

[[2]]
[[2]]$object
   [raw hex dump of the serialized keras model (an HDF5 blob produced by serialize_model) omitted; printing stopped at getOption("max.print") with 20656 entries remaining]

[[2]]$epochs
[1] 1000

[[2]]$validation_split
[1] 0.2

[[2]]$verbose
[1] 0

[[2]]$batch_size
[1] 128

[[3]]
[[3]]$type
[1] "nu-regression"

[[3]]$cost
[1] 8

[[3]]$nu
[1] 0.5

[[3]]$kernel
[1] "radial"

[[4]]
[[4]]$alpha
[1] 1

[[4]]$nfolds
[1] 10

[[4]]$degree
[1] 2

[[5]]
[[5]]$degree
[1] 1
if(length(regression_functions) != length(extra_args_regression_params)){
    stop("Number of models and number of lists of hyperparameters mismatch")
}
imputed_data <- infinity_flow(
    regression_functions = regression_functions,
    extra_args_regression_params = extra_args_regression_params,
    path_to_fcs = path_to_fcs,
    path_to_output = path_to_output,
    path_to_intermediary_results = path_to_intermediary_results,
    backbone_selection_file = backbone_selection_file,
    annotation = targets,
    isotype = isotypes,
    input_events_downsampling = input_events_downsampling,
    prediction_events_downsampling = prediction_events_downsampling,
    verbose = TRUE,
    #Note: there is an issue with serialization of the neural networks and socketing since I updated to R-4.0.1. If you want to use neural networks, please make sure to set cores = 1L
    cores = cores,
    neural_networks_seed = 12345
)

Output

Using directories...
    input: /Users/clusteredatom/Documents/DPHIL_DATA/scRNAseq/T230T240T246_CXCR5Project/scRNAseq_Analysis_Scripts_2022Nov22/HD_Flow/Infinity_Flow/basic_usage_tutorial/infinity_flow_example/fcs
    intermediary: /Users/clusteredatom/Documents/DPHIL_DATA/scRNAseq/T230T240T246_CXCR5Project/scRNAseq_Analysis_Scripts_2022Nov22/HD_Flow/Infinity_Flow/basic_usage_tutorial/infinity_flow_example/tmp
    subset: /Users/clusteredatom/Documents/DPHIL_DATA/scRNAseq/T230T240T246_CXCR5Project/scRNAseq_Analysis_Scripts_2022Nov22/HD_Flow/Infinity_Flow/basic_usage_tutorial/infinity_flow_example/tmp/subsetted_fcs
    rds: /Users/clusteredatom/Documents/DPHIL_DATA/scRNAseq/T230T240T246_CXCR5Project/scRNAseq_Analysis_Scripts_2022Nov22/HD_Flow/Infinity_Flow/basic_usage_tutorial/infinity_flow_example/tmp/rds
    annotation: /Users/clusteredatom/Documents/DPHIL_DATA/scRNAseq/T230T240T246_CXCR5Project/scRNAseq_Analysis_Scripts_2022Nov22/HD_Flow/Infinity_Flow/basic_usage_tutorial/infinity_flow_example/tmp/annotation.csv
    output: /Users/clusteredatom/Documents/DPHIL_DATA/scRNAseq/T230T240T246_CXCR5Project/scRNAseq_Analysis_Scripts_2022Nov22/HD_Flow/Infinity_Flow/basic_usage_tutorial/infinity_flow_example/output
Parsing and subsampling input data
    Downsampling to 1000 events per input file
    Concatenating expression matrices
    Writing to disk
Logicle-transforming the data
    Backbone data
    Exploratory data
    Writing to disk
    Transforming expression matrix
    Writing to disk
Harmonizing backbone data
    Scaling expression matrices
    Writing to disk
Fitting regression models
    Randomly selecting 50% of the subsetted input files to fit models
    Fitting...
        XGBoost

  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=06s  
    6 seconds
        NN

  |                                                  | 0 % ~calculating
2023-01-30 14:12:34.372079: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
2023-01-30 14:12:34.481941: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
2023-01-30 14:12:35.277328: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x2b93988d0
2023-01-30 14:12:35.277355: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x2b93988d0
2023-01-30 14:12:35.461522: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x2b93988d0
2023-01-30 14:12:35.461546: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x2b93988d0
2023-01-30 14:12:35.465332: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x2b93988d0
2023-01-30 14:12:35.465353: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x2b93988d0
Error in py_call_impl(callable, dots$args, dots$keywords) : 
  tensorflow.python.framework.errors_impl.NotFoundError: Graph execution error:
<... omitted ...>ages/keras/optimizers/optimizer_experimental/optimizer.py", line 1166, in _internal_apply_gradients
      return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/Users/clusteredatom/Library/r-miniconda-arm64/envs/r-reticulate/lib/python3.8/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1216, in _distributed_apply_gradients_fn
      distribution.extended.update(
    File "/Users/clusteredatom/Library/r-miniconda-arm64/envs/r-reticulate/lib/python3.8/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1211, in apply_grad_to_update_var
      return self._update_step_xla(grad, var, id(self._var_key(var)))
Node: 'StatefulPartitionedCall_4'
could not find registered platform with id: 0x2b93988d0
     [[{{node StatefulPartitionedCall_4}}]] [Op:__inference_train_function_609]
See `reticulate::py_last_error()` for details
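The `NOT_FOUND: could not find registered platform` failure above is raised from `_update_step_xla` inside the experimental Keras optimizer, i.e. during XLA-compiled gradient updates, which the tensorflow-metal plugin did not support at the time. A hedged diagnostic sketch (it assumes the `tensorflow` R package is installed and that reticulate points at the same conda environment; the legacy-optimizer line is a commonly suggested workaround hinted at by the deprecation warning above, not something infinityFlow exposes directly):

```r
## Diagnostic sketch: confirm TensorFlow (via reticulate) sees the GPU at all
library(tensorflow)
tf$config$list_physical_devices("GPU")  # the Metal PluggableDevice should be listed

## Possible workaround for the XLA NOT_FOUND error on Apple silicon:
## use the legacy optimizer, which does not go through _update_step_xla.
## opt <- tf$keras$optimizers$legacy$SGD(learning_rate = 0.005)
```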
> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     L'Ecuyer-CMRG 
 Normal:  Inversion 
 Sample:  Rejection 

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tensorflow_2.11.0    keras_2.11.0         e1071_1.7-12         glmnetUtils_1.1.8    infinityFlow_1.3.1   flowCore_2.10.0      openxlsx_4.2.5.1    
 [8] readxl_1.4.1         stringr_1.5.0        ggplot2_3.4.0        patchwork_1.1.2.9000 SeuratObject_4.1.3   Seurat_4.3.0         dplyr_1.0.10        

loaded via a namespace (and not attached):
  [1] plyr_1.8.8             igraph_1.3.5           lazyeval_0.2.2         sp_1.6-0               splines_4.2.2          listenv_0.9.0         
  [7] scattermore_0.8        tfruns_1.5.1           digest_0.6.31          foreach_1.5.2          htmltools_0.5.4        fansi_1.0.4           
 [13] magrittr_2.0.3         tensor_1.5             cluster_2.1.4          ROCR_1.0-11            globals_0.16.2         matrixStats_0.63.0    
 [19] spatstat.sparse_3.0-0  cytolib_2.10.1         colorspace_2.1-0       ggrepel_0.9.2          xfun_0.36              jsonlite_1.8.4        
 [25] progressr_0.13.0       spatstat.data_3.0-0    zeallot_0.1.0          survival_3.5-0         zoo_1.8-11             iterators_1.0.14      
 [31] glue_1.6.2             polyclip_1.10-4        gtable_0.3.1           leiden_0.4.3           future.apply_1.10.0    shape_1.4.6           
 [37] BiocGenerics_0.44.0    abind_1.4-5            scales_1.2.1           DBI_1.1.3              spatstat.random_3.1-3  miniUI_0.1.1.1        
 [43] Rcpp_1.0.10            viridisLite_0.4.1      xtable_1.8-4           reticulate_1.27        matlab_1.0.4           proxy_0.4-27          
 [49] stats4_4.2.2           glmnet_4.1-6           htmlwidgets_1.6.1      httr_1.4.4             RColorBrewer_1.1-3     ellipsis_0.3.2        
 [55] ica_1.0-3              pkgconfig_2.0.3        sass_0.4.5             uwot_0.1.14            deldir_1.0-6           utf8_1.2.2            
 [61] here_1.0.1             tidyselect_1.2.0       rlang_1.0.6            reshape2_1.4.4         later_1.3.0            cachem_1.0.6          
 [67] munsell_0.5.0          cellranger_1.1.0       tools_4.2.2            xgboost_1.7.3.1        cli_3.6.0              generics_0.1.3        
 [73] ggridges_0.5.4         evaluate_0.20          fastmap_1.1.0          yaml_2.3.7             goftest_1.2-3          knitr_1.42            
 [79] fitdistrplus_1.1-8     zip_2.2.2              purrr_1.0.1            RANN_2.6.1             pbapply_1.7-0          future_1.30.0         
 [85] nlme_3.1-161           whisker_0.4.1          mime_0.12              compiler_4.2.2         rstudioapi_0.14        plotly_4.10.1.9000    
 [91] png_0.1-8              spatstat.utils_3.0-1   tibble_3.1.8           bslib_0.4.2            stringi_1.7.12         lattice_0.20-45       
 [97] Matrix_1.5-3           vctrs_0.5.2            pillar_1.8.1           lifecycle_1.0.3.9000   jquerylib_0.1.4        spatstat.geom_3.0-5   
[103] lmtest_0.9-40          RcppAnnoy_0.0.20       data.table_1.14.6      cowplot_1.1.1          irlba_2.3.5.1          raster_3.6-14         
[109] httpuv_1.6.8           R6_2.5.1               promises_1.2.0.1       KernSmooth_2.23-20     gridExtra_2.3          RProtoBufLib_2.10.0   
[115] parallelly_1.34.0      sessioninfo_1.2.2      codetools_0.2-18       MASS_7.3-58.2          assertthat_0.2.1       rprojroot_2.0.3       
[121] withr_2.5.0            sctransform_0.3.5      S4Vectors_0.36.1       parallel_4.2.2         terra_1.7-3            grid_4.2.2            
[127] tidyr_1.3.0            class_7.3-21           rmarkdown_2.20         Rtsne_0.16             spatstat.explore_3.0-5 Biobase_2.58.0        
[133] shiny_1.7.4            base64enc_0.1-3       
ebecht commented 1 year ago

Hi @denvercal1234GitHub

I am not familiar enough with tensorflow / keras to see where this error is coming from. Have you tested these packages independently of infinityFlow to see whether they work on your machine? Also, have you made sure to disable parallelization by setting cores = 1L in infinity_flow()?

Best, Etienne

denvercal1234GitHub commented 1 year ago

Thank you @ebecht for all your help and patience with all my crazy questions.

I was finally able to fix the error and run infinity_flow() with the neural network after activating my TensorFlow conda environment before starting R.

My M1 Max laptop has 10 cores in total (8 performance and 2 efficiency), so I tried cores = 2L and obtained the following outputs.

Q1. Would you mind telling me what the "L" in 2L indicates? Do you recommend using all 8 performance cores for infinity_flow?

Q2. Although the output still said "I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support", the pipeline appeared to run fine and the GPU was in fact used (as seen in Activity Monitor). Is the output below what you also observed when you ran the neural network on your data?

[Five screenshots of the pipeline output, taken 2023-03-22]

Thanks again!

ebecht commented 1 year ago

Hello again,

Q1. The L indicates that the argument is of type integer rather than numeric (double). More cores means it will run faster; you can use as many as your hardware supports. It should not affect the quality of the results, only the speed of computation.
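The distinction can be checked directly in base R:

```r
typeof(2L)       # "integer"  -- the L suffix makes an integer literal
typeof(2)        # "double"   -- plain numeric literals are doubles
is.integer(2L)   # TRUE

## Number of logical cores R can see (on Apple silicon this counts
## both performance and efficiency cores):
parallel::detectCores()
```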

Q2. It's nice that you managed to use the GPU; I actually could not. Training and using the models should be much faster then. Good for you!

denvercal1234GitHub commented 1 year ago

Thanks Etienne!