wilhelm-lab / koina

Democratizing ML in proteomics
https://koina.wilhelmlab.org/
Apache License 2.0

Using the RESTful API from R #9

Closed tobiasko closed 1 year ago

tobiasko commented 1 year ago

Hi everyone,

I have the following issue: I am trying to predict retention times (RTs) for multiple peptides with the DeepLC model. My R code is:

> ## two peptide sequences
> headers = c(
+   `Content-Type` = 'application/x-www-form-urlencoded'
+ )
> data = '{\n\t"id" : "LGGNEQVTR_GAGSSEPVTGLDAK",\n  "inputs" : [ {\n  "name" : "peptides_in_str:0",\n  "shape" : [ 1,2 ],\n  "datatype"  : "BYTES",\n  "data" : ["LGGNEQVTR","GAGSSEPVTGLDAK"]\n} ]}'
> res <- httr::POST(url = 'http://eubic2023.external.msaid.io:8501/v2/models/Deeplc_Triton_ensemble/infer', httr::add_headers(.headers=headers), body = data)
> status_code(res)
[1] 200
> str(content(res))
List of 5
 $ id           : chr "LGGNEQVTR_GAGSSEPVTGLDAK"
 $ model_name   : chr "Deeplc_Triton_ensemble"
 $ model_version: chr "1"
 $ parameters   :List of 3
  ..$ sequence_id   : int 0
  ..$ sequence_start: logi FALSE
  ..$ sequence_end  : logi FALSE
 $ outputs      :List of 1
  ..$ :List of 4
  .. ..$ name    : chr "dense_323"
  .. ..$ datatype: chr "FP32"
  .. ..$ shape   :List of 2
  .. .. ..$ : int 1
  .. .. ..$ : int 1
  .. ..$ data    :List of 1
  .. .. ..$ : num 2.56

The model returns only one prediction for the two input sequences. Does anybody know why, and how to fix it?

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] jsonlite_1.8.0 httr_1.4.2    

loaded via a namespace (and not attached):
[1] compiler_4.1.2 R6_2.5.1       tools_4.1.2    curl_4.3.2    

tkschmidt commented 1 year ago

Please change

> data = '{\n\t"id" : "LGGNEQVTR_GAGSSEPVTGLDAK",\n  "inputs" : [ {\n  "name" : "peptides_in_str:0",\n  "shape" : [ 1,2 ],\n  "datatype"  : "BYTES",\n  "data" : ["LGGNEQVTR","GAGSSEPVTGLDAK"]\n} ]}'

to

> data = '{\n\t"id" : "LGGNEQVTR_GAGSSEPVTGLDAK",\n  "inputs" : [ {\n  "name" : "peptides_in_str:0",\n  "shape" : [ 2,1 ],\n  "datatype"  : "BYTES",\n  "data" : ["LGGNEQVTR","GAGSSEPVTGLDAK"]\n} ]}'

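The only difference is the shape: [1,2] versus [2,1]. With [1,2] the request describes a single row holding two strings, whereas [2,1] describes two rows of one peptide each, which is what yields one prediction per peptide.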
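
To avoid writing the shape by hand, the request body could be generated from the peptide vector. This is only a minimal sketch using jsonlite: `build_infer_body` is a hypothetical helper, not part of Koina, and it assumes the input name `peptides_in_str:0` from the request above and one peptide per row.

```r
library(jsonlite)

# Hypothetical helper (not part of Koina): builds the JSON request body
# for a character vector of peptides, with one row per peptide.
build_infer_body <- function(peptides, input_name = "peptides_in_str:0") {
  payload <- list(
    id = paste(peptides, collapse = "_"),
    inputs = list(list(
      name = input_name,
      # [n, 1]: n rows with one sequence each
      shape = c(length(peptides), 1L),
      datatype = "BYTES",
      # I() keeps "data" a JSON array even for a single peptide
      data = I(peptides)
    ))
  )
  as.character(toJSON(payload, auto_unbox = TRUE))
}

body <- build_infer_body(c("LGGNEQVTR", "GAGSSEPVTGLDAK"))
cat(body, "\n")
```

The resulting string can be passed as `body` to `httr::POST` in place of the hand-written `data` string.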

tobiasko commented 1 year ago

Perfect! 🙏

> data = '{\n\t"id" : "LGGNEQVTR_GAGSSEPVTGLDAK",\n  "inputs" : [ {\n  "name" : "peptides_in_str:0",\n  "shape" : [ 2,1 ],\n  "datatype"  : "BYTES",\n  "data" : ["LGGNEQVTR","GAGSSEPVTGLDAK"]\n} ]}'
> res <- httr::POST(url = 'http://eubic2023.external.msaid.io:8501/v2/models/Deeplc_Triton_ensemble/infer', httr::add_headers(.headers=headers), body = data)
> status_code(res)
[1] 200
> str(content(res))
List of 5
 $ id           : chr "LGGNEQVTR_GAGSSEPVTGLDAK"
 $ model_name   : chr "Deeplc_Triton_ensemble"
 $ model_version: chr "1"
 $ parameters   :List of 3
  ..$ sequence_id   : int 0
  ..$ sequence_start: logi FALSE
  ..$ sequence_end  : logi FALSE
 $ outputs      :List of 1
  ..$ :List of 4
  .. ..$ name    : chr "dense_323"
  .. ..$ datatype: chr "FP32"
  .. ..$ shape   :List of 2
  .. .. ..$ : int 2
  .. .. ..$ : int 1
  .. ..$ data    :List of 2
  .. .. ..$ : num 2.56
  .. .. ..$ : num 3.5
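
For anyone who wants the predictions as a table, the nested `outputs` list can be flattened as in the sketch below. It uses a mock list that mirrors the `str()` output above instead of a live request, so the two values are the rounded numbers as displayed there. It also assumes that the predictions come back in the same order as the input peptides, which the thread does not confirm.

```r
# Mock of content(res), mirroring the structure printed by str() above.
# The values are the rounded numbers shown there, not exact predictions.
parsed <- list(
  outputs = list(list(
    name = "dense_323",
    datatype = "FP32",
    shape = list(2L, 1L),
    data = list(2.56, 3.5)
  ))
)

peptides <- c("LGGNEQVTR", "GAGSSEPVTGLDAK")

# Flatten the list of numbers into a plain numeric vector
predicted <- unlist(parsed$outputs[[1]]$data)

# Guard against the original problem: fewer predictions than peptides
stopifnot(length(predicted) == length(peptides))

# Assumption: output order matches input order
result <- data.frame(peptide = peptides, predicted_rt = predicted)
print(result)
```

With a real response, `parsed` would be replaced by `content(res)`.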