kusterlab / prosit

Prosit offers high-quality MS2 predicted spectra for any organism and protease, as well as iRT prediction. If Prosit is helpful for your research, please cite "Gessulat, Schmidt et al. 2019", DOI 10.1038/s41592-019-0426-7
https://www.proteomicsdb.org/prosit/
Apache License 2.0

prosit interactive mode works while server mode fails #8

Closed tobigithub closed 5 years ago

tobigithub commented 5 years ago

Hello, the current interactive Prosit version works, while the Prosit server version produces multiple errors when the peptide list is sent to the server with curl. So I think many of the errors mentioned in the earlier issues #7 and #4 are actually Flask errors, or possibly issues in the Prosit Makefile or code, or related to how arguments are passed to Flask.

Proof: Prosit in interactive mode creates the predictions:

root@8e29fa711bd2:~# ls -l
total 24
drwxr-xr-x 2 root root 4096 May 30 18:27 data.hdf5
-rw-r--r-- 1 root root 1012 Jun 24 06:00 jump.py
drwx------ 2 1000 1000 4096 Jun 24 05:21 model
-rw-r--r-- 1 root root 2027 Jun 24 06:00 msms_prediction.csv
-rw-r--r-- 1 root root  114 Jun 24 06:00 peptidelist.csv
drwxr-xr-x 5 root root 4096 Jun 24 06:00 prosit
root@8e29fa711bd2:~# cat msms_prediction.csv
Intensities     Masses  Matches Modified Sequence       Charge
1.0;0.41259626;0.73062277;0.30424523;0.11640168;0.093716405;0.08098759;0.1923069;0.08597337;0.15234001;0.08258821;0.012585764;0.034173176;0.013465268;0.0030653037;0.0014310256;0.008246724;0.004191969;0.00061403663;0.0013039891;0.0022206986;0.0005943045  175.118952167;322.15434716699997;435.238411167;548.322475167;619.359589167;690.3967031669999;761.4338171669999;263.088246467;360.141010467;431.178124467;502.21523846699995;573.2523524669999;218.122843817;429.74692881699997;132.047761467;180.574143467;216.092700467;251.61125746699997;207.12471403366666;230.80375203366665;254.48279003366665;229.45032313366664  y1;y2;y3;y4;y5;y6;y7;b2;b3;b4;b5;b6;y3(2+);y8(2+);b2(2+);b3(2+);b4(2+);b5(2+);y5(3+);y6(3+);y7(3+);b7(3+)  MMPAAALIM(ox)R  3
0.031838626;0.05340379;0.0008666609;0.16591543;0.6199195;1.0;0.3755796;0.006878452;0.8553704;0.18602312;0.009526462;0.1253908;0.6669085;0.056322724;0.00025479423;0.0041989647     147.112804167;294.148199167;407.232263167;504.285027167;601.3377911670001;698.3905551670001;769.4276691670001;882.5117331670001;245.131825467;316.168939467;413.221703467;301.17253381700004;349.69891581700006;385.21747281700004;158.58810796699998;207.114489967   y1;y2;y3;y4;y5;y6;y7;y8;b2;b3;b4;y5(2+);y6(2+);y7(2+);b3(2+);b4(2+)        MLAPPPIM(ox)K   2
0.6478142;0.7623719;1.0;0.78580517;0.24467614;0.037683975;0.98502636;0.46840557;0.30539873;0.5721372;0.27308717;0.025923852;0.31577614;0.6946651;0.4260823;0.0029871135;0.016204849;0.0243927;0.0019134118;0.027638009;0.008317641;0.0010335244       175.118952167;419.20711116699994;516.259875167;613.312639167;710.3654031670001;132.047761467;288.148872467;359.185986467;472.27005046700003;585.354114467;698.4381784670001;811.5222424670001;258.633575817;307.159957817;355.68633981700003;180.096631467;236.63866346700001;293.180695467;349.722727467;59.044501700333335;237.45998536700003;44.68743813366666        y1;y3;y4;y5;y6;b1;b2;b3;b4;b5;b6;b7;y4(2+);y5(2+);y6(2+);b3(2+);b4(2+);b5(2+);b6(2+);y1(3+);y6(3+);b1(3+)  MRALLLIPPPPM(ox)R       6
root@8e29fa711bd2:~#
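
For anyone who wants to post-process this file: each row holds three semicolon-separated lists (Intensities, Masses, Matches) of equal length, followed by the modified sequence and the charge. A minimal sketch of a reader is shown below. It assumes the columns are tab-separated, which is how the output above appears; `read_predictions` is a hypothetical helper name and not part of Prosit. The sample row is the first row above, truncated to its first two peaks.

```python
import csv
import io


def read_predictions(handle, delimiter="\t"):
    """Yield one dict per peptide; peaks are (annotation, mass, intensity) tuples.

    Assumption: the file is tab-separated with the header shown in the thread.
    """
    for row in csv.DictReader(handle, delimiter=delimiter):
        intensities = [float(v) for v in row["Intensities"].split(";")]
        masses = [float(v) for v in row["Masses"].split(";")]
        matches = row["Matches"].split(";")
        # The three lists describe the same peaks, so they must line up.
        if not (len(intensities) == len(masses) == len(matches)):
            raise ValueError("peak lists differ in length for " + row["Modified Sequence"])
        yield {
            "sequence": row["Modified Sequence"],
            "charge": int(row["Charge"]),
            "peaks": list(zip(matches, masses, intensities)),
        }


# First row from the output above, truncated to two peaks for brevity.
SAMPLE = (
    "Intensities\tMasses\tMatches\tModified Sequence\tCharge\n"
    "1.0;0.41259626\t175.118952167;322.15434716699997\ty1;y2\tMMPAAALIM(ox)R\t3\n"
)

for peptide in read_predictions(io.StringIO(SAMPLE)):
    print(peptide["sequence"], peptide["charge"], peptide["peaks"])
```

To read the real file, pass `open("msms_prediction.csv", newline="")` instead of the in-memory sample.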

Code (jump.py):

# Example for manual start of prosit
# runfile for sudo make jump 
# sudo make jump MODEL=/home/xxx/prosit/prosit1

# once inside container
# cp prosit/peptidelist.csv .
# cp prosit/jump.py .
# python jump.py

from prosit import constants
from prosit import maxquant
from prosit import alignment
from prosit import prediction
from prosit import tensorize
import pandas

from prosit import model as model_lib

# load the trained model from constants.MODEL_DIR
# (module-level "global" statements are no-ops, so they are omitted)
model_dir = constants.MODEL_DIR
model, model_config = model_lib.load(model_dir, trained=True)

# read peptides
df = pandas.read_csv("peptidelist.csv")
tensor = tensorize.peptidelist(df)
result = prediction.predict(tensor, model, model_config)
df_pred = maxquant.convert_prediction(result)

# write predictions to msms_prediction.csv in the current directory
# (alternative: path = "{}prediction.csv".format(model_dir))
path = "msms_prediction.csv"
maxquant.write(df_pred, path)

# copy file "msms_prediction.csv" out of container
# find docker ID: sudo docker ps
# sudo docker cp cbfc30fce3f1:/root/msms_prediction.csv .

When running

sudo make server MODEL=/home/xxx/prosit/prosit1

and running the following in a second terminal

curl -F "peptides=@examples/peptidelist.csv" http://127.0.0.1:5000/predict/

the following error occurs:

* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
[2019-06-24 06:20:20,889] ERROR in app: Exception on /predict/ [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 2311, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1834, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1737, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.5/dist-packages/flask/_compat.py", line 36, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1832, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1818, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/root/prosit/server.py", line 28, in predict
    result = prediction.predict(tensor, model, model_config)
  File "/root/prosit/prediction.py", line 14, in predict
    model.compile(optimizer="adam", loss="mse")
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 333, in compile
    sample_weight, mask)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training_utils.py", line 403, in weighted
    score_array = fn(y_true, y_pred)
  File "/usr/local/lib/python3.5/dist-packages/keras/losses.py", line 14, in mean_squared_error
    return K.mean(K.square(y_pred - y_true), axis=-1)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 848, in binary_op_wrapper
    with ops.name_scope(None, op_name, [x, y]) as name:
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 5770, in __enter__
    g = _get_graph_from_inputs(self._values)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 5428, in _get_graph_from_inputs
    _assert_same_graph(original_graph_element, graph_element)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 5364, in _assert_same_graph
    original_item))
ValueError: Tensor("out_target:0", shape=(?, ?), dtype=float32) must be from the same graph as Tensor("out/Reshape:0", shape=(?, ?), dtype=float32).
172.17.0.1 - - [24/Jun/2019 06:20:20] "POST /predict/ HTTP/1.1" 500 -
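
A note on this traceback: the `must be from the same graph` error is commonly seen when a Keras model on the TensorFlow 1.x backend is loaded in one thread and then used in another, because TensorFlow 1.x keeps its default graph per thread. If the server handles requests in worker threads, `model.compile` inside the request creates new tensors in a different graph than the one holding the loaded model. A frequent workaround is to remember the graph that was current at load time and re-enter it for every request. The sketch below illustrates that idea under these assumptions; it is not the code from the Prosit repository. `pin_to_graph` is a hypothetical helper, and it only relies on the graph object offering an `as_default()` context manager, as `tf.Graph` does.

```python
import functools
import threading


def pin_to_graph(graph, fn):
    """Return a version of `fn` that always runs under `graph`.

    Hypothetical helper: `graph` is any object with an `as_default()`
    context manager (tf.Graph in TensorFlow 1.x provides one).
    """
    # Serialize calls; assumption: the wrapped model code is not safe
    # for concurrent use from several request threads.
    lock = threading.Lock()

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # Re-enter the load-time graph so ops created inside `fn`
        # (for example by model.compile) land in the same graph as the model.
        with lock, graph.as_default():
            return fn(*args, **kwargs)

    return wrapper


# Intended use with TensorFlow 1.x (assumption, not executed here):
#   import tensorflow as tf
#   model, model_config = model_lib.load(model_dir, trained=True)
#   graph = tf.get_default_graph()          # capture right after loading
#   predict = pin_to_graph(graph, prediction.predict)
#   result = predict(tensor, model, model_config)   # inside the request handler
```

Wrapping the whole prediction function rather than only `model.predict` matters here, since the traceback shows the failure inside `model.compile`.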

There is also a Keras YAML warning (not an error). Maybe Flask cannot handle it, or Prosit does not contain the appropriate code to suppress this warning?

/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py:349: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config = yaml.load(yaml_string)
 * Serving Flask app "server" (lazy loading)

gessulat commented 5 years ago

Try this branch: https://github.com/kusterlab/prosit/tree/fix/server_issue

I don't quite understand why this happens for some environments and not for others.

tobigithub commented 5 years ago

@gessulat Thanks

tobigithub commented 5 years ago

@gessulat I tried the fixed server branch (https://github.com/kusterlab/prosit/tree/fix/server_issue). It runs and returns results; however, submitting a second file breaks the server.

~/prosit$ curl -F "peptides=@examples/peptidelist.csv" http://127.0.0.1:5000/predict/
Intensities     Masses  Matches Modified Sequence       Charge
1.0;0.41259626;0.73062277;0.30424523;0.11640168;0.093716405;0.08098759;0.1923069;0.08597337;0.15234001;0.08258821;0.012585764;0.034173176;0.013465268;0.0030653037;0.0014310256;0.008246724;0.004191969;0.00061403663;0.0013039891;0.0022206986;0.0005943045  175.118952167;322.15434716699997;435.238411167;548.322475167;619.359589167;690.3967031669999;761.4338171669999;263.088246467;360.141010467;431.178124467;502.21523846699995;573.2523524669999;218.122843817;429.74692881699997;132.047761467;180.574143467;216.092700467;251.61125746699997;207.12471403366666;230.80375203366665;254.48279003366665;229.45032313366664     y1;y2;y3;y4;y5;y6;y7;b2;b3;b4;b5;b6;y3(2+);y8(2+);b2(2+);b3(2+);b4(2+);b5(2+);y5(3+);y6(3+);y7(3+);b7(3+)     MMPAAALIM(ox)R  3
0.031838626;0.05340379;0.0008666609;0.16591543;0.6199195;1.0;0.3755796;0.006878452;0.8553704;0.18602312;0.009526462;0.1253908;0.6669085;0.056322724;0.00025479423;0.0041989647        147.112804167;294.148199167;407.232263167;504.285027167;601.3377911670001;698.3905551670001;769.4276691670001;882.5117331670001;245.131825467;316.168939467;413.221703467;301.17253381700004;349.69891581700006;385.21747281700004;158.58810796699998;207.114489967   y1;y2;y3;y4;y5;y6;y7;y8;b2;b3;b4;y5(2+);y6(2+);y7(2+);b3(2+);b4(2+)   MLAPPPIM(ox)K   2
0.6478142;0.7623719;1.0;0.78580517;0.24467614;0.037683975;0.98502636;0.46840557;0.30539873;0.5721372;0.27308717;0.025923852;0.31577614;0.6946651;0.4260823;0.0029871135;0.016204849;0.0243927;0.0019134118;0.027638009;0.008317641;0.0010335244       175.118952167;419.20711116699994;516.259875167;613.312639167;710.3654031670001;132.047761467;288.148872467;359.185986467;472.27005046700003;585.354114467;698.4381784670001;811.5222424670001;258.633575817;307.159957817;355.68633981700003;180.096631467;236.63866346700001;293.180695467;349.722727467;59.044501700333335;237.45998536700003;44.68743813366666   y1;y3;y4;y5;y6;b1;b2;b3;b4;b5;b6;b7;y4(2+);y5(2+);y6(2+);b3(2+);b4(2+);b5(2+);b6(2+);y1(3+);y6(3+);b1(3+)     MRALLLIPPPPM(ox)R       6

Sending the same CSV file with curl a second time breaks the server:

* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
172.17.0.1 - - [24/Jun/2019 22:59:26] "POST /predict/ HTTP/1.1" 200 -
[2019-06-24 23:03:20,541] ERROR in app: Exception on /predict/ [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 2311, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1834, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1737, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.5/dist-packages/flask/_compat.py", line 36, in reraise
    raise value
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1832, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.5/dist-packages/flask/app.py", line 1818, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/root/prosit/server.py", line 29, in predict
    result = prediction.predict(tensor, model, model_config, graph)
  File "/root/prosit/prediction.py", line 17, in predict
    x, verbose=verbose, batch_size=constants.PRED_BATCH_SIZE
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1167, in predict
    steps=steps)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training_arrays.py", line 294, in predict_loop
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2666, in __call__
    return self._call(inputs)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2636, in _call
    fetched = self._callable_fn(*array_vals)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1382, in __call__
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value dense_1/kernel
         [[Node: dense_1/kernel/read = Identity[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense_1/kernel)]]
         [[Node: out/Reshape/_23 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_606_out/Reshape", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
172.17.0.1 - - [24/Jun/2019 23:03:20] "POST /predict/ HTTP/1.1" 500 -
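
The "first request succeeds, second fails" pattern can also be checked from a script instead of running curl by hand. The sketch below is a rough Python equivalent of the curl command used in this thread, using only the standard library. It sends the same file several times as the `peptides` form field and collects the HTTP status codes. `build_multipart` and `post_repeatedly` are hypothetical helper names, and the `text/csv` content type is an assumption.

```python
import urllib.error
import urllib.request
import uuid


def build_multipart(field, filename, payload):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="{f}"; filename="{n}"\r\n'
        "Content-Type: text/csv\r\n\r\n"
    ).format(b=boundary, f=field, n=filename).encode("utf-8")
    tail = "\r\n--{b}--\r\n".format(b=boundary).encode("utf-8")
    return head + payload + tail, "multipart/form-data; boundary=" + boundary


def post_repeatedly(url, payload, times=2, field="peptides", filename="peptidelist.csv"):
    """POST the same file `times` times and return the list of HTTP status codes."""
    statuses = []
    for _ in range(times):
        body, content_type = build_multipart(field, filename, payload)
        request = urllib.request.Request(
            url, data=body, headers={"Content-Type": content_type}, method="POST"
        )
        try:
            with urllib.request.urlopen(request) as response:
                statuses.append(response.status)
        except urllib.error.HTTPError as err:
            # urllib raises for 4xx/5xx; record the code instead of stopping.
            statuses.append(err.code)
    return statuses


# Example (assumes the Prosit server from this thread is listening locally):
#   with open("examples/peptidelist.csv", "rb") as handle:
#       print(post_repeatedly("http://127.0.0.1:5000/predict/", handle.read()))
```

With the behaviour described above, the first status would be 200 and the second 500.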

Basically, after one prediction run the GPU memory is not freed: nvidia-smi shows almost the full memory allocated even when the GPU is no longer in use. See also https://github.com/keras-team/keras/issues/12625

|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 30%   49C    P8    17W / 250W |  10787MiB / 11178MiB |      0%      Default |

gessulat commented 5 years ago

Note that the master branch has now been updated. The server should no longer have the issues described above.

tobigithub commented 5 years ago

@gessulat thank you, the new version works great, no hiccups!