kserve / modelmesh-serving

Controller for ModelMesh
Apache License 2.0

Triton Inference Server doesn't return any values when using gRPC protocol but RESTful does #449

Closed anhquan075 closed 10 months ago

anhquan075 commented 1 year ago

Describe the bug Triton Inference Server doesn't return any output values when using the gRPC protocol. The issue occurs only with gRPC requests; the RESTful protocol works as expected. The gRPC protocol of MLServer also returns values as expected.

To Reproduce Steps to reproduce the behavior:

  1. Create an InferenceService (isvc) that uses the Triton Inference Server ServingRuntime with the gRPC protocol.
  2. Send an inference request using gRPC.
  3. Observe that there are no output values returned.

Expected Behavior When making inference requests using the gRPC protocol, Triton Inference Server should return the expected values, as it does with the RESTful protocol and as MLServer does with the gRPC protocol.

Actual Behavior The server does not return any values for inference requests made using the gRPC protocol with Triton Inference Server.

Screenshots My curl example using the RESTful protocol with Triton Inference Server:

curl --location 'localhost:8008/v2/models/model-7b8c7d46-b3a2-42b3-abca-cf84e4aced91/infer' \
--header 'Content-Type: application/json' \
--data '{
    "inputs": [
        {
            "name": "input0",
            "shape": [
                1,
                32,
                12
            ],
            "datatype": "FP32",
            "data": [-0.7176903832456532, -1.0157740209135122, 0.4194852223916975, 0.8752199851715384, -0.014565039600956918, -0.22149571677475563, -0.8903942835116693, -0.7679129122620975, 1.1892652648300432, 0.8860959894104754, -0.1999654163564351, -1.5724296724174216, 0.6094014392028905, 1.3833165923284705, 0.6108914182443717, -1.123827061691862, 0.0544211445960993, 0.11010764791350473, -1.8624705368039545, 0.6765224931345186, -0.8641030795960931, 0.3987866100262092, 0.47637917605103386, -0.4166939283537705, 0.010519441064237915, 1.1962840653758564, 0.0783669330819023, 0.9653746650570291, -2.3122638958519497, -0.3248970977225927, -0.9667312555186839, -0.4171928442215796, 0.9484617840376967, -0.46536687418861933, 1.4608614463690268, -0.8689846574093149, -0.23482061093885986, 0.10969287502658806, 0.5339535604581099, 1.4828620884088286, 0.3498742856561679, 1.2700771236477135, 1.165910137224342, 0.7578825235289999, -0.20972046893900623, 0.4697084664601432, 0.6316780825984691, 0.5748990529621442, 1.3552843492754065, 0.17649875564667988, 1.5825062643581809, 1.0516928391161409, -0.7417114986519779, 0.9189731940059582, -0.18106619806442628, 0.1520078104773814, -1.177581143021401, 0.5334755441180627, 0.34740010694598544, 0.14517907966771113, -0.7937972565680115, -0.6870856647499359, 0.3706250511002557, 0.24060835190790075, 0.04207969268560036, 1.2719917204002829, 1.4911417195652914, -0.8364720853586, 0.6958477731703606, -0.12567495517400068, 0.6053143057966077, -1.8864126976150832, 0.5710897548076836, 0.29729158785258947, 1.437026554161514, 0.5704858222766066, 0.07859655356831197, 0.11199096104790122, -1.4433855673187292, -1.2121764752260917, 0.06712121478649533, -1.4559056220493702, -1.573413663499016, 1.7952771042032931, 2.619491663513298, 0.858590096770223, 2.008816114776211, -0.8142107698107222, -1.0582195773561263, -0.013454526777534063, 1.449704181682262, 1.8405172568258261, 0.3472804136846818, -0.4995810938935084, 0.12227142155976399, -0.6940453433221152, 
-1.0618672848113824, 0.7033622587293908, 0.7852737750914629, 2.2990021444618587, -0.9401208505107684, -0.7949222722780278, 1.4202499376808158, 1.1937674754285228, 0.3128399225739123, 1.5200006160123112, -0.6009295125626132, 0.12580101672841953, -2.2730035740883974, 0.3334796750634016, -0.3684004330428562, 0.27092592607818017, 0.6992883604185364, 0.6402573185441105, 0.6843086486182087, 2.0468301146660712, 0.6156746190788044, 1.6452886474430544, -1.3303845060805939, 0.3450149195622283, 1.3346784963984253, 0.720133596072146, 0.6729081329739532, 0.24700647583265628, 1.2743394988515524, 1.9288560374087198, 1.1204921178850658, 0.05556212742529328, 1.4273545098814207, -0.2320528229717757, 0.3735908948819337, 0.7332162783185312, -0.42094358145358024, 2.0952578738116823, -1.310536017776799, 0.12366966111533781, -0.9399062297286871, 0.8338910525993335, 1.180994228319032, 0.20370875114302944, 0.443282493786549, 0.4036392518848064, -0.9190233075973175, 0.4020652828645791, 1.0843999496088683, -0.20504474516108778, 1.321644507924335, -1.9751077500456085, -1.6197418223199511, 0.7592056709740629, -0.4395107816071784, -0.2085524294782932, 0.06068781725293597, 0.18320242598681907, -0.7080002105533353, 1.2183141175488577, -1.5428879128783886, 0.4244214278154205, 0.03715660989032172, 3.038454190413897, 0.1938511019378095, 0.10871614736079181, 1.378256235329479, 0.9175887442705597, -0.018907889926360024, -2.582180172449743, 1.5225823286976974, -0.9737548891087339, 1.573420600588381, 1.2473481158808362, 0.9617480972264908, 0.3764007917214983, 1.5476485071005257, 0.3480013978966639, 1.2948751855800902, -0.48827360261334896, -1.0486867968683096, 0.9535892308809825, 0.36373011849667103, -0.4791154231626282, 0.17238444527619554, -0.17189632806205793, -0.19802384257316213, 0.1515872473562646, 1.3910516007115754, -1.970036688311045, 1.680330848333716, 0.2519599488877222, -0.023425752828022787, -1.77617790865702, 0.11042710246710764, -0.3526810370070084, 0.02495680689263148, 2.51162631682882, 
1.3911775874485437, -0.519469743927326, -1.359383098439164, -0.8052476557592473, -0.6143800969918121, -0.28807780512833664, 0.48913121507576995, 0.03406527172806874, -0.03477053519822078, 0.304810368378932, 1.2222346334178038, -0.4045110136697703, -1.5700835436911817, -1.895251114354789, -0.15887805270611557, -3.2211147523867165, 0.6899538116609479, -1.3072940479307618, -0.38177676756939083, 0.19203153423124997, -0.7565620156710247, -0.22499250319782674, 0.4784685471761122, -0.47523699543859194, 2.229446363511042, 1.9307877116651966, -0.639128157554252, 0.02410333155162819, 0.11610919429594367, 0.20874493532497787, 0.3422095981686792, -0.17235353225975497, -0.007179098286764847, -0.7365585009385849, -0.1860206664518323, 1.6564011329818393, 0.14126838998752494, -0.024106957253960472, 1.4834238092827245, 1.3200437205607771, -1.2936816776071656, -1.5786980231707768, -0.7485326891587313, -1.5343527553786462, 0.17836438447041153, 0.6281377170947245, 0.7940528344872515, -0.384539118486145, 1.915237279581315, 0.5232344049523302, 0.44870594387793833, -0.8199068471276318, -1.424898595259054, -0.2580133980328819, -0.4910033102344325, 0.9254207335305745, -1.3878577045128702, 0.9120240539476625, 1.1617631461992304, -1.0413263962390438, -0.5206066742884088, 2.0657455723464886, 0.04534634587421682, -0.6007723326062216, 0.18800891447884419, 1.4886888525930244, 1.2493521915845047, 0.48977869234450044, -0.44479330738075107, -0.14298236438407852, -0.4842646557624694, 1.438229312045533, -1.0404715883546884, 0.568151432659012, 2.2250362886443105, 0.008463281820969126, 0.7348262423750134, 0.1027684681815915, -1.4047203806673483, 0.034630004449068934, -0.5100246770719928, 0.24422893886151611, 0.004879062150609986, -2.0131879722552695, -1.4051662191482819, 0.48342736469285075, -0.5721863621949334, -0.5855943483590188, 0.688669571299865, -1.5647566576123686, -0.2886686329506002, 0.11891395338112762, -0.16300241913295874, 0.8820645346618062, 1.3380715312115221, 0.1364567993761385, 
-0.35372629688461615, 0.022721904274153194, -0.81416877822691, -0.42686735948478355, 0.0839573531010532, 0.0796677235550361, -0.5146320764147442, 2.1943963350263322, 1.026660810177237, 1.1761360889963441, -0.2731262884001127, 0.9518104877609802, 2.0171188677523277, 1.2345553071712583, -0.21820134240141864, 0.016102381653818228, 0.8405109111126311, 1.704041125808497, -2.4470810633278552, 0.2056039217128093, 0.1146413488571081, 0.5761920553693763, -1.000225577269603, -0.8350669519790883, -0.7432412060115644, -0.2492660933516803, 0.8859392890954946, -0.7414505896845763, -0.4338461093246398, -2.367918119893251, -1.6240409117058177, 0.8496537688702788, -0.4113440789225947, -1.0893987494908102, 1.762165103110704, -1.138330016111171, -1.3490587426113598, -1.3409776707738685, 1.0809915045677825, 0.28731854451755795, 0.20237029062144338, 0.1855454151638621, -1.2503405708196789, -0.3938881758598866, -0.8800242097004197, -1.4284988157403833, -1.6395358654722139, -1.0920369735180486, -0.30446133746153486, -0.3191925266293908, 0.16808640683903553, 1.0168735174795254, 2.0959208486451937, 0.5666604705570054, 0.3264539450401426, -0.7219109578957356, -1.152314776921426, 2.193359964824078, 0.35685122910571393, 0.8053081910781844, -0.14881933174992798, 1.0039254844521461, -0.02364746320525155, -0.25342997760321523, -0.14363036074068022, -0.3043896453916708, -0.9439308865702818, 1.0186393847811968, -0.6219692087880438, -2.0576415677354634, -0.01365188014660091, -0.6965186481991746, -1.5876789891899552, -0.47590079713747735, 0.23185419362463877, -0.8429839186052104, 0.4652068444257226, -0.6409580653637347, -0.1110619994803934, -0.07885186319918261, 0.19888929142793538, -1.3009011312092154, 0.7987809219629829, 1.4721533697961977, 0.35333483657175163, -0.30032851045948733, -0.2341169746737359, -0.926327691261229, 0.5864856604647578, 1.553506944402208, 0.47364197533675984, 0.9215136376447445, 0.9893433674165261, 0.38266957195874685]
        }
    ]
}'

Meanwhile, my grpcurl for the gRPC protocol with Triton Inference Server:

grpcurl \
  -plaintext \
  -proto fvt/proto/kfs_inference_v2.proto \
  -d '{
  "model_name": "model-7b8c7d46-b3a2-42b3-abca-cf84e4aced91",
  "inputs": [
      {
    "name": "input0",
    "shape": [
      1,
      32,
      12
    ],
    "datatype": "FP32",
    "contents": { "fp32_contents": [-0.7176903832456532, -1.0157740209135122, 0.4194852223916975, 0.8752199851715384, -0.014565039600956918, -0.22149571677475563, -0.8903942835116693, -0.7679129122620975, 1.1892652648300432, 0.8860959894104754, -0.1999654163564351, -1.5724296724174216, 0.6094014392028905, 1.3833165923284705, 0.6108914182443717, -1.123827061691862, 0.0544211445960993, 0.11010764791350473, -1.8624705368039545, 0.6765224931345186, -0.8641030795960931, 0.3987866100262092, 0.47637917605103386, -0.4166939283537705, 0.010519441064237915, 1.1962840653758564, 0.0783669330819023, 0.9653746650570291, -2.3122638958519497, -0.3248970977225927, -0.9667312555186839, -0.4171928442215796, 0.9484617840376967, -0.46536687418861933, 1.4608614463690268, -0.8689846574093149, -0.23482061093885986, 0.10969287502658806, 0.5339535604581099, 1.4828620884088286, 0.3498742856561679, 1.2700771236477135, 1.165910137224342, 0.7578825235289999, -0.20972046893900623, 0.4697084664601432, 0.6316780825984691, 0.5748990529621442, 1.3552843492754065, 0.17649875564667988, 1.5825062643581809, 1.0516928391161409, -0.7417114986519779, 0.9189731940059582, -0.18106619806442628, 0.1520078104773814, -1.177581143021401, 0.5334755441180627, 0.34740010694598544, 0.14517907966771113, -0.7937972565680115, -0.6870856647499359, 0.3706250511002557, 0.24060835190790075, 0.04207969268560036, 1.2719917204002829, 1.4911417195652914, -0.8364720853586, 0.6958477731703606, -0.12567495517400068, 0.6053143057966077, -1.8864126976150832, 0.5710897548076836, 0.29729158785258947, 1.437026554161514, 0.5704858222766066, 0.07859655356831197, 0.11199096104790122, -1.4433855673187292, -1.2121764752260917, 0.06712121478649533, -1.4559056220493702, -1.573413663499016, 1.7952771042032931, 2.619491663513298, 0.858590096770223, 2.008816114776211, -0.8142107698107222, -1.0582195773561263, -0.013454526777534063, 1.449704181682262, 1.8405172568258261, 0.3472804136846818, -0.4995810938935084, 0.12227142155976399, 
-0.6940453433221152, -1.0618672848113824, 0.7033622587293908, 0.7852737750914629, 2.2990021444618587, -0.9401208505107684, -0.7949222722780278, 1.4202499376808158, 1.1937674754285228, 0.3128399225739123, 1.5200006160123112, -0.6009295125626132, 0.12580101672841953, -2.2730035740883974, 0.3334796750634016, -0.3684004330428562, 0.27092592607818017, 0.6992883604185364, 0.6402573185441105, 0.6843086486182087, 2.0468301146660712, 0.6156746190788044, 1.6452886474430544, -1.3303845060805939, 0.3450149195622283, 1.3346784963984253, 0.720133596072146, 0.6729081329739532, 0.24700647583265628, 1.2743394988515524, 1.9288560374087198, 1.1204921178850658, 0.05556212742529328, 1.4273545098814207, -0.2320528229717757, 0.3735908948819337, 0.7332162783185312, -0.42094358145358024, 2.0952578738116823, -1.310536017776799, 0.12366966111533781, -0.9399062297286871, 0.8338910525993335, 1.180994228319032, 0.20370875114302944, 0.443282493786549, 0.4036392518848064, -0.9190233075973175, 0.4020652828645791, 1.0843999496088683, -0.20504474516108778, 1.321644507924335, -1.9751077500456085, -1.6197418223199511, 0.7592056709740629, -0.4395107816071784, -0.2085524294782932, 0.06068781725293597, 0.18320242598681907, -0.7080002105533353, 1.2183141175488577, -1.5428879128783886, 0.4244214278154205, 0.03715660989032172, 3.038454190413897, 0.1938511019378095, 0.10871614736079181, 1.378256235329479, 0.9175887442705597, -0.018907889926360024, -2.582180172449743, 1.5225823286976974, -0.9737548891087339, 1.573420600588381, 1.2473481158808362, 0.9617480972264908, 0.3764007917214983, 1.5476485071005257, 0.3480013978966639, 1.2948751855800902, -0.48827360261334896, -1.0486867968683096, 0.9535892308809825, 0.36373011849667103, -0.4791154231626282, 0.17238444527619554, -0.17189632806205793, -0.19802384257316213, 0.1515872473562646, 1.3910516007115754, -1.970036688311045, 1.680330848333716, 0.2519599488877222, -0.023425752828022787, -1.77617790865702, 0.11042710246710764, -0.3526810370070084, 
0.02495680689263148, 2.51162631682882, 1.3911775874485437, -0.519469743927326, -1.359383098439164, -0.8052476557592473, -0.6143800969918121, -0.28807780512833664, 0.48913121507576995, 0.03406527172806874, -0.03477053519822078, 0.304810368378932, 1.2222346334178038, -0.4045110136697703, -1.5700835436911817, -1.895251114354789, -0.15887805270611557, -3.2211147523867165, 0.6899538116609479, -1.3072940479307618, -0.38177676756939083, 0.19203153423124997, -0.7565620156710247, -0.22499250319782674, 0.4784685471761122, -0.47523699543859194, 2.229446363511042, 1.9307877116651966, -0.639128157554252, 0.02410333155162819, 0.11610919429594367, 0.20874493532497787, 0.3422095981686792, -0.17235353225975497, -0.007179098286764847, -0.7365585009385849, -0.1860206664518323, 1.6564011329818393, 0.14126838998752494, -0.024106957253960472, 1.4834238092827245, 1.3200437205607771, -1.2936816776071656, -1.5786980231707768, -0.7485326891587313, -1.5343527553786462, 0.17836438447041153, 0.6281377170947245, 0.7940528344872515, -0.384539118486145, 1.915237279581315, 0.5232344049523302, 0.44870594387793833, -0.8199068471276318, -1.424898595259054, -0.2580133980328819, -0.4910033102344325, 0.9254207335305745, -1.3878577045128702, 0.9120240539476625, 1.1617631461992304, -1.0413263962390438, -0.5206066742884088, 2.0657455723464886, 0.04534634587421682, -0.6007723326062216, 0.18800891447884419, 1.4886888525930244, 1.2493521915845047, 0.48977869234450044, -0.44479330738075107, -0.14298236438407852, -0.4842646557624694, 1.438229312045533, -1.0404715883546884, 0.568151432659012, 2.2250362886443105, 0.008463281820969126, 0.7348262423750134, 0.1027684681815915, -1.4047203806673483, 0.034630004449068934, -0.5100246770719928, 0.24422893886151611, 0.004879062150609986, -2.0131879722552695, -1.4051662191482819, 0.48342736469285075, -0.5721863621949334, -0.5855943483590188, 0.688669571299865, -1.5647566576123686, -0.2886686329506002, 0.11891395338112762, -0.16300241913295874, 0.8820645346618062, 
1.3380715312115221, 0.1364567993761385, -0.35372629688461615, 0.022721904274153194, -0.81416877822691, -0.42686735948478355, 0.0839573531010532, 0.0796677235550361, -0.5146320764147442, 2.1943963350263322, 1.026660810177237, 1.1761360889963441, -0.2731262884001127, 0.9518104877609802, 2.0171188677523277, 1.2345553071712583, -0.21820134240141864, 0.016102381653818228, 0.8405109111126311, 1.704041125808497, -2.4470810633278552, 0.2056039217128093, 0.1146413488571081, 0.5761920553693763, -1.000225577269603, -0.8350669519790883, -0.7432412060115644, -0.2492660933516803, 0.8859392890954946, -0.7414505896845763, -0.4338461093246398, -2.367918119893251, -1.6240409117058177, 0.8496537688702788, -0.4113440789225947, -1.0893987494908102, 1.762165103110704, -1.138330016111171, -1.3490587426113598, -1.3409776707738685, 1.0809915045677825, 0.28731854451755795, 0.20237029062144338, 0.1855454151638621, -1.2503405708196789, -0.3938881758598866, -0.8800242097004197, -1.4284988157403833, -1.6395358654722139, -1.0920369735180486, -0.30446133746153486, -0.3191925266293908, 0.16808640683903553, 1.0168735174795254, 2.0959208486451937, 0.5666604705570054, 0.3264539450401426, -0.7219109578957356, -1.152314776921426, 2.193359964824078, 0.35685122910571393, 0.8053081910781844, -0.14881933174992798, 1.0039254844521461, -0.02364746320525155, -0.25342997760321523, -0.14363036074068022, -0.3043896453916708, -0.9439308865702818, 1.0186393847811968, -0.6219692087880438, -2.0576415677354634, -0.01365188014660091, -0.6965186481991746, -1.5876789891899552, -0.47590079713747735, 0.23185419362463877, -0.8429839186052104, 0.4652068444257226, -0.6409580653637347, -0.1110619994803934, -0.07885186319918261, 0.19888929142793538, -1.3009011312092154, 0.7987809219629829, 1.4721533697961977, 0.35333483657175163, -0.30032851045948733, -0.2341169746737359, -0.926327691261229, 0.5864856604647578, 1.553506944402208, 0.47364197533675984, 0.9215136376447445, 0.9893433674165261, 0.38266957195874685]
  }}
  ]
}' \
  localhost:8033 \
  inference.GRPCInferenceService.ModelInfer

When I run a model using the MLServer runtime via gRPC, I receive the expected outputs.


Environment (please complete the following information):

tjohnson31415 commented 1 year ago

Hi @anhquan075. Thanks for creating a detailed issue report with all relevant information and screenshots to boot!

The Triton Inference Server's gRPC interface returns the output tensors as raw bytes in the raw_output_contents field for performance reasons (I tried to find a good doc page describing this, but only found this issue comment). For each output tensor in outputs there will be an entry in raw_output_contents, which grpcurl displays as a base64-encoded string of the tensor's raw bytes. The outputs metadata tells you the shape and datatype that you then need to use to parse the bytes. For your example, the output is 12 32-bit floats that will need to be parsed out.

Here's an example in Go from our FVTs where we do this parsing for our test request: https://github.com/kserve/modelmesh-serving/blob/b5affffb642a7b65877e50923b29db24f09f8265/fvt/inference.go#L374-L380
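The same parsing can be sketched in Python using only the standard library. This is a rough illustration, not the FVT code: the function name `decode_fp32_output` is made up for this example, and it assumes the raw bytes are little-endian FP32 values and that the base64 string is one entry copied from raw_output_contents in the grpcurl response.

```python
import base64
import struct
from math import prod


def decode_fp32_output(raw_b64: str, shape: list[int]) -> list[float]:
    """Decode one base64 entry of raw_output_contents into Python floats."""
    raw = base64.b64decode(raw_b64)
    count = prod(shape)  # number of elements implied by the output shape
    if len(raw) != 4 * count:
        raise ValueError(
            f"expected {4 * count} bytes for shape {shape}, got {len(raw)}"
        )
    # "<" means little-endian (an assumption here), "f" means 32-bit float.
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with made-up values standing in for a real response entry.
sample = base64.b64encode(struct.pack("<3f", 1.0, 2.5, -0.5)).decode("ascii")
print(decode_fp32_output(sample, [1, 3]))
```

The shape passed in should be the one reported in the outputs metadata of the response, so that the length check catches a datatype or shape mismatch early.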

The Triton Inference Server Client could also be used to do this output post-processing.

Let me know if this helps or if it doesn't resolve your issue!

anhquan075 commented 1 year ago

@tjohnson31415 thank you for the response. I will try to parse the bytes content to get the output. Btw, I also found another way to fix it: using the gRPC client in the tritonclient Python SDK.
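For reference, a minimal sketch of that approach, assuming the `tritonclient[grpc]` and `numpy` packages are installed. The helper name `infer_fp32` is invented for this example, and the output tensor name has to be supplied by the caller because it is not given in this thread.

```python
def infer_fp32(url, model_name, input_name, output_name, values, shape):
    """Send one FP32 tensor over gRPC and return the named output as a NumPy array."""
    # Imported lazily so the helper can be defined without the packages installed.
    import numpy as np
    import tritonclient.grpc as grpcclient

    client = grpcclient.InferenceServerClient(url=url)

    # Build the input tensor from a flat list of values.
    data = np.asarray(values, dtype=np.float32).reshape(shape)
    infer_input = grpcclient.InferInput(input_name, list(shape), "FP32")
    infer_input.set_data_from_numpy(data)

    requested = grpcclient.InferRequestedOutput(output_name)
    result = client.infer(
        model_name=model_name, inputs=[infer_input], outputs=[requested]
    )
    # as_numpy parses the raw output bytes using the shape and datatype metadata.
    return result.as_numpy(output_name)
```

With the values from the request above, a call could look like `infer_fp32("localhost:8033", "model-7b8c7d46-b3a2-42b3-abca-cf84e4aced91", "input0", "output0", values, (1, 32, 12))`, where `"output0"` is only a placeholder for the model's real output name.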