griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

pVACtools not working with newest tensorflow (2.16.1) #1084

Closed Stikus closed 4 months ago

Stikus commented 4 months ago

Installation Type

Standalone

pVACtools Version / Docker Image

4.1.1

Python Version

3.10

Operating System

Ubuntu 22.04

Describe the bug

Because the tensorflow version is only loosely pinned, some legacy code stopped working after the recent update to tensorflow 2.16.1. This may be related to the changes described here: https://github.com/tensorflow/tensorflow/releases/tag/v2.16.1.

If we pin the tensorflow version to 2.15.1, everything works again.

How to reproduce this bug

python3 /usr/local/bin/pvacseq run -t 32 -e1 9,10 -e2 15 --pass-only  --iedb-install-directory /soft/IEDB /input/input.vcf TUMOR "HLA-A*02:01,HLA-B*35:01,DRB1*11:01" MHCflurry MHCnuggetsI MHCnuggetsII NNalign NetMHC PickPocket SMM SMMPMBEC SMMalign /output

Input files

No response

Log output

An exception occured in thread 1: (<class 'Exception'>, An error occurred while calling MHCnuggets:
/usr/local/lib/python3.10/dist-packages/keras/src/layers/core/masking.py:47: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/call_mhcnuggets.py", line 84, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/call_mhcnuggets.py", line 61, in main
    predict(args.class_type, tmp_file.name, mhcnuggets_allele(args.allele, args.class_type), output=tmp_output_file.name, rank_output=True)
  File "/usr/local/lib/python3.10/dist-packages/mhcnuggets/src/predict.py", line 93, in predict
    model.compile(loss='mse', optimizer=Adam(lr=0.001))
  File "/usr/local/lib/python3.10/dist-packages/keras/src/optimizers/adam.py", line 60, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/dist-packages/keras/src/backend/tensorflow/optimizer.py", line 19, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/keras/src/optimizers/base_optimizer.py", line 38, in __init__
    raise ValueError(f"Argument(s) not recognized: {kwargs}")
ValueError: Argument(s) not recognized: {'lr': 0.001}
).
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/prediction_class.py", line 513, in predict
    response = run(arguments, check=True, stdout=DEVNULL, stderr=stderr_fh)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mhcflurry-predict', '--alleles', 'HLA-A*02:01', '--out', '/output/MHC_Class_I/tmp/tmpjje516d8', '--peptides', 'EPPRPPQQP', 'QGSGEKAGC', 'PLELAYCLQ', 'LVRSRTYDM', 'DYLSDRCKI', 'RSAGQHWAR', 'AGNLFNCEC', 'TQEVKVKEP', 'KAVRPLELV', 'GSESRVEPP', 'AAAAAVIPT', 'SPVKEEEKP', 'SAGQHWARL', 'YLSDRCKIL', 'KAGCPWSGT', 'PARPPQQPV', 'HSRTYDMDV', 'AAAAAAVIP', 'ALNHLLTEE', 'DAVQGIANQ', 'VSPEPPRPP', 'ERTEASGYE', 'IANEDAAQG', 'SHVWTRSRD', 'PVKEEEKPQ', 'CDDMDCLSD', 'VTISCTGSS', 'VSPEPARPP', 'AMSHFEPNE', 'DMDYLSDRC', 'TVVVTTQKR', 'PGSSPTTVI', 'SYGLLHIYG', 'PPLLPLLPL', 'PCDDMDCLS', 'LAFTRLTSE', 'HLLTEEEDY', 'YGLLHTYGS', 'NEDAAQGIA', 'KTTVVVTTQ', 'VRPLELAYC', 'LFNCECDLF', 'YSYGLLHTY', 'YLSVRGGFN', 'GNLFNCECD', 'GRSAGQHWA', 'TLLAAAGGS', 'TEASGYESR', 'CLSDRCKIL', 'NEALNHLLT', 'LLLLLLGAS', 'AKTTVVVTA', 'PPPPLLPLL', 'NMSSFKLKE', 'RSAGSTGQG', 'GQGSGEKAG', 'ELAGNPFNC', 'GSHVWTRSR', 'AVQGIANED', 'RPGSAPTTV', 'CELAGNPFN', 'ELAGNLFNC', 'CELAGNLFN', 'VSLFGALVR', 'IYGSGGYAL', 'AVIPTVSTP', 'EDAAQGIAK', 'QRPGSAPTT', 'EALNNLLTE', 'KEEEKTQEV', 'EASGSESRV', 'WEFLASTRL', 'TLSRTLLLA', 'EFLASTRLT', 'PPPLLPLLP', 'SHVWTHSRD', 'LELVYCLQK', 'LHTVSPEPP', 'ALNNLLTEE', 'FLAFTRLTS', 'STGQGSGEK', 'DCLSDRCKI', 'VVVTTQKRN', 'LNHLLTEEE', 'YYSYGLLHI', 'QGIANQDAA', 'PGSAPTTVI', 'LPLLPLLLL', 'LCQRAKVEM', 'WYQQRPGSA', 'EKTQEVKVK', 'TRSRDPEGS', 'LSRTLLLAA', 'KTVTISCTG', 'YQQRPGSSP', 'PPLLPLLLL', 'LLLAAAGGS', 'GSTGQGSGE', 'QRPGSSPTT', 'VQGIANQDA', 'LGWEFLAFT', 'SGEKAGCPW', 'VTTQKRNSR', 'PFNCECDLF', 'EMSHFEPNE', 'PQEVKVKEP', 'EFLAFTRLT', 'TEASGSESR', 'TISCTGSSG', 'ANEDAAQGI', 'NLFNCECDL', 'CQRAKVEMS', 'AKVAMSHFE', 'KSVNEALNH', 'LLPLLLLLL', 'VAMSHFEPN', 'VLCQRAKVE', 'APTTVIYED', 'ASGYESRVE', 'WYQQRPGSS', 'IGKHGGVSL', 'NPQTDYLTG', 'TYGSGGYAL', 'EEKTQEVKV', 'SRTLLAAAG', 'SPTTVIYED', 'ERTEASGSE', 'VYCLQKCNV', 'PEPARPPQQ', 'GALVRSRTY', 'SAAAAAAAV', 'QQRPGSAPT', 'EEEKTQEVK', 'TLKFNPETD', 'GSGEKAGCP', 'FNPQTDYLT', 'HHEDLIGKP', 'KPQEVKVKE', 'HEDLIGKHG', 'CTGSSGSIA', 'ELAYCLQKC', 'ATLSRTLLL', 
'LLHTYGSGG', 'LSVRGGFNM', 'HTYGSGGYA', 'KVAMSHFEP', 'RTEASGSES', 'VNEALNHLL', 'YSYGLLHIY', 'VEMSHFEPN', 'FLASTRLTS', 'RPGSSPTTV', 'RSRTYDMDV', 'TTVVVTTQK', 'AFTRLTSEL', 'AGTLKFNPE', 'RPLELVYCL', 'ANQDAAQGI', 'KEEEKPQEV', 'TLSRTLLAA', 'TLLLAAAGG', 'VWTHSRDPE', 'SAGSTGQGS', 'MDCLSDRCK', 'LHTYGSGGY', 'HTVSPEPAR', 'VLCQRAKVA', 'TGQGSGEKA', 'LIGKHGGVS', 'TGSSGSIAS', 'TVSPEPPRP', 'TVTISCTGS', 'TVVVTAQKR', 'SSPTTVIYE', 'SVRGGFNMS', 'LPLLLLLGA', 'PLLPLLPLL', 'QPCDDMDCL', 'SAPTTVIYE', 'KHGGVSLSK', 'QPCDDMDYL', 'SESRVEPPH', 'HEDLIGKPG', 'TTQKRNSRR', 'PEPPRPPQQ', 'KSVNEALNN', 'ASGSESRVE', 'PLLLLLLGA', 'GFNMSSFKL', 'LPLLLLLLG', 'QGIANEDAA', 'GGSHVWTHS', 'LAGNPFNCE', 'SPVKEEEKT', 'PVKEEEKTQ', 'IGKPGGVSL', 'GLLHIYGSG', 'GSSGSIASN', 'RAKVAMSHF', 'EEKPQEVKV', 'AYCLQKCNV', 'FNPETDYLT', 'MVCELAGNP', 'NEALNNLLT', 'ISCTRSSGS', 'VRSRTYDMD', 'DDMDCLSDR', 'YYSYGLLHT', 'VKEEEKPQE', 'RTLLAAAGG', 'LHTVSPEPA', 'GLLHTYGSG', 'MDYLSDRCK', 'VKEEEKTQE', 'GWEFLAFTR', 'TVTISCTRS', 'PPPLLPLLL', 'LHIYGSGGY', 'EDLIGKPGG', 'GSSPTTVIY', 'LAGNLFNCE', 'PLELVYCLQ', 'VRPLELVYC', 'STRLTSELN', 'GGGGRSAGQ', 'LLPLLLLLG', 'TLKFNPQTD', 'GGRSAGSTG', 'ATLSRTLLA', 'SGSESRVEP', 'SLFGALVRS', 'NMSSFKLKQ', 'AGCPWSGTG', 'GGSHVWTRS', 'RPLELAYCL', 'TRSSGSIAS', 'EERTEASGY', 'AKTTVVVTT', 'AAAVIPTVS', 'LFGALVHSR', 'HSRDPEGSS', 'EALNHLLTE', 'WEFLAFTRL', 'PQTDYLTGT', 'LCQRAKVAM', 'LLAAAGGSS', 'GGRSAGQHW', 'GALVHSRTY', 'CQRAKVAMS', 'SAAAAAAAA', 'GWEFLASTR', 'PETDYLTGT', 'LPPPPLLPL', 'NHLLTEEED', 'PRPPQQPVP', 'KFNPETDYL', 'YQPCDDMDC', 'HVWTRSRDP', 'VTAQKRNSR', 'NLLTEEEDY', 'RTEASGYES', 'AGNPFNCEC', 'QRAKVAMSH', 'YQQRPGSAP', 'CTRSSGSIA', 'GYESRVEPP', 'SCTGSSGSI', 'CDDMDYLSD', 'ALVRSRTYD', 'GTLKFNPET', 'KVEMSHFEP', 'QRAKVEMSH', 'EDLIGKHGG', 'EKPQEVKVK', 'AAVIPTVST', 'VNEALNNLL', 'KPGGVSLSK', 'AGTLKFNPQ', 'GGGGRSAGS', 'ASAAAAAAA', 'AGQHWARLR', 'VSLFGALVH', 'GGGRSAGQH', 'GGFNMSSFK', 'VCELAGNLF', 'LGWEFLAST', 'GKHGGVSLS', 'PPRPPQQPV', 'GSAPTTVIY', 'IANQDAAQG', 'MVCELAGNL', 'VVTTQKRNS', 'GEKAGCPWS', 'YGLLHIYGS', 'RAKVEMSHF', 
'LIGKPGGVS', 'LKFNPETDY', 'SPEPPRPPQ', 'VTISCTRSS', 'VRGGFNMSS', 'AVRPLELAY', 'GCPWSGTGQ', 'NNLLTEEED', 'ASTRLTSEL', 'FTRLTSELN', 'ISCTGSSGS', 'GNPFNCECD', 'LAYCLQKCN', 'ELVYCLQKC', 'GTLKFNPQT', 'EPARPPQQP', 'GIANEDAAQ', 'LNNLLTEEE', 'SGYESRVEP', 'LFGALVRSR', 'KTTVVVTAQ', 'GIANQDAAQ', 'VQGIANEDA', 'AAAAAAAAA', 'CPWSGTGQH', 'SGGSHVWTH', 'KFNPQTDYL', 'LLHIYGSGG', 'LSRTLLAAA', 'TAQKRNSRR', 'HVWTHSRDP', 'ALVHSRTYD', 'NPETDYLTG', 'SVNEALNHL', 'HIYGSGGYA', 'SYGLLHTYG', 'FGALVHSRT', 'THSRDPEGS', 'QTDYLTGTD', 'GKPGGVSLS', 'RTLLLAAAG', 'LVYCLQKCN', 'AVQGIANQD', 'HGGVSLSKI', 'ALGWEFLAF', 'AAAAVIPTV', 'NPFNCECDL', 'NQDAAQGIA', 'VVTAQKRNS', 'SVNEALNNL', 'PLLPLLLLL', 'VHSRTYDMD', 'DMDCLSDRC', 'AKVEMSHFE', 'EERTEASGS', 'EEEKPQEVK', 'AAAAAAAVI', 'HHEDLIGKH', 'EASGYESRV', 'PGGVSLSKI', 'ALGWEFLAS', 'FNMSSFKLK', 'SGGSHVWTR', 'TVSPEPARP', 'YQPCDDMDY', 'VWTRSRDPE', 'GSHVWTHSR', 'SLFGALVHS', 'DLIGKHGGV', 'VCELAGNPF', 'KTVTISCTR', 'KTQEVKVKE', 'WTRSRDPEG', 'YESRVEPPH', 'AVRPLELVY', 'KAVRPLELA', 'PLLLLLGAS', 'RSRDPEGSS', 'EKAGCPWSG', 'ETDYLTGTD', 'TISCTRSSG', 'RSSGSIASN', 'FGALVRSRT', 'HTVSPEPPR', 'ARPPQQPVP', 'TGGGGRSAG', 'SRTLLLAAA', 'TQKRNSRRQ', 'RGGFNMSSF', 'VVVTAQKRN', 'LLPLLPLLL', 'PCDDMDYLS', 'SPEPARPPQ', 'AQKRNSRRQ', 'SCTRSSGSI', 'QDAAQGIAK', 'LKFNPQTDY', 'GRSAGSTGQ', 'WTHSRDPEG', 'DAVQGIANE', 'DLIGKPGGV', 'TTVVVTAQK', 'LVHSRTYDM', 'DDMDYLSDR', 'GGGRSAGST', 'SATLSRTLL', 'QQRPGSSPT', 'AGSTGQGSG', 'LASTRLTSE', 'AAAAAAAAV', 'LELAYCLQK']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/pipeline.py", line 357, in call_iedb
    pvactools.lib.call_iedb.main(arguments)
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/call_iedb.py", line 46, in main
    raise err
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/call_iedb.py", line 41, in main
    (response_text, output_mode) = prediction_class_object.predict(args.input_file, args.allele, args.epitope_length, args.iedb_executable_path, args.iedb_retries, tmp_dir=args.tmp_dir, log_dir=args.log_dir)
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/prediction_class.py", line 519, in predict
    raise Exception("An error occurred while calling MHCflurry:\n{}".format(err))
Exception: An error occurred while calling MHCflurry:
Traceback (most recent call last):
  File "/usr/local/bin/mhcflurry-predict", line 8, in <module>
    sys.exit(run())
  File "/usr/local/lib/python3.10/dist-packages/mhcflurry/predict_command.py", line 204, in run
    predictor = Class1PresentationPredictor.load(models_dir)
  File "/usr/local/lib/python3.10/dist-packages/mhcflurry/class1_presentation_predictor.py", line 956, in load
    affinity_predictor = Class1AffinityPredictor.load(
  File "/usr/local/lib/python3.10/dist-packages/mhcflurry/class1_affinity_predictor.py", line 608, in load
    optimized = result.optimize()
  File "/usr/local/lib/python3.10/dist-packages/mhcflurry/class1_affinity_predictor.py", line 653, in optimize
    Class1NeuralNetwork.merge(
  File "/usr/local/lib/python3.10/dist-packages/mhcflurry/class1_neural_network.py", line 1148, in merge
    configure_tensorflow()
  File "/usr/local/lib/python3.10/dist-packages/mhcflurry/common.py", line 131, in configure_tensorflow
    tensorflow.compat.v1.keras.backend.set_session(session)
AttributeError: module 'keras._tf_keras.keras.backend' has no attribute 'set_session'. Did you mean: 'set_epsilon'?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/pvacseq", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/pvactools/tools/pvacseq/main.py", line 123, in main
    args[0].func.main(args[1])
  File "/usr/local/lib/python3.10/dist-packages/pvactools/tools/pvacseq/run.py", line 142, in main
    pipeline.execute()
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/pipeline.py", line 451, in execute
    self.call_iedb(chunks)
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/pipeline.py", line 349, in call_iedb
    with pymp.Parallel(self.n_threads) as p:
  File "/usr/local/lib/python3.10/dist-packages/pymp/__init__.py", line 148, in __exit__
    raise exc_t(exc_val)
Exception: An error occurred while calling MHCnuggets:
/usr/local/lib/python3.10/dist-packages/keras/src/layers/core/masking.py:47: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/call_mhcnuggets.py", line 84, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/pvactools/lib/call_mhcnuggets.py", line 61, in main
    predict(args.class_type, tmp_file.name, mhcnuggets_allele(args.allele, args.class_type), output=tmp_output_file.name, rank_output=True)
  File "/usr/local/lib/python3.10/dist-packages/mhcnuggets/src/predict.py", line 93, in predict
    model.compile(loss='mse', optimizer=Adam(lr=0.001))
  File "/usr/local/lib/python3.10/dist-packages/keras/src/optimizers/adam.py", line 60, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/dist-packages/keras/src/backend/tensorflow/optimizer.py", line 19, in __init__
    super().__init__(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/keras/src/optimizers/base_optimizer.py", line 38, in __init__
    raise ValueError(f"Argument(s) not recognized: {kwargs}")
ValueError: Argument(s) not recognized: {'lr': 0.001}

Output files

No response

susannasiebert commented 4 months ago

Thank you for this report. This problem is specific to the prediction algorithms that use tensorflow. Since pVACtools has no direct dependency on tensorflow, we have decided not to pin a specific version of this package. It is up to the prediction algorithms themselves to pin the tensorflow versions they are compatible with, or to update their code to work with the latest tensorflow version. As such, we recommend that you open a ticket with MHCflurry and any other prediction algorithm that uses tensorflow. As you noted, users can pin tensorflow<=2.15.1 in their environment as a workaround until each prediction algorithm updates its code.
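
For reference, the MHCnuggets traceback in the report fails on `Adam(lr=0.001)`, and Keras optimizers take the rate as `learning_rate`. One way a prediction algorithm could update such a call is to rename the legacy keyword before constructing the optimizer. The sketch below is only an illustration: `modernize_optimizer_kwargs` is a hypothetical helper, not part of MHCnuggets, Keras, or pVACtools, and it works on a plain dict so it runs without tensorflow installed.

```python
def modernize_optimizer_kwargs(kwargs):
    """Return a copy of kwargs with the legacy `lr` key renamed to `learning_rate`.

    Hypothetical helper for illustration; not part of any library.
    """
    updated = dict(kwargs)  # copy so the caller's dict is left untouched
    if "lr" in updated:
        legacy_value = updated.pop("lr")
        # If the caller passed both keys, the explicit learning_rate wins.
        updated.setdefault("learning_rate", legacy_value)
    return updated


# The keyword arguments from the failing call in the traceback.
print(modernize_optimizer_kwargs({"lr": 0.001}))  # {'learning_rate': 0.001}
```

The result could then be passed on as `Adam(**modernize_optimizer_kwargs(...))`. Whether this is the only change MHCnuggets needs for tensorflow 2.16 has not been checked here.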

Stikus commented 4 months ago

@susannasiebert Thanks for the fast answer, but I have a question:
pVACtools currently pins mhcflurry==2.0.6, but the latest release is 2.1.0 (https://github.com/openvax/mhcflurry/releases/tag/v2.1.0), and it includes some tensorflow-related fixes.

Moreover, 2.1.0 no longer has the problematic line:
https://github.com/openvax/mhcflurry/blob/v2.1.0/mhcflurry/common.py#L80

Here is the old version used by pVACtools: https://github.com/openvax/mhcflurry/blob/v2.0.6b/mhcflurry/common.py#L131

    if num_threads:
        config.inter_op_parallelism_threads = num_threads
        config.intra_op_parallelism_threads = num_threads
    session = tensorflow.compat.v1.Session(config=config)
    tensorflow.compat.v1.disable_v2_behavior()
    tensorflow.compat.v1.keras.backend.set_session(session)

Maybe you should update mhcflurry first?
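
To illustrate why that last line fails: the `AttributeError` in the log says the backend module no longer exposes `set_session`. A defensive pattern is to look the attribute up before calling it. This is only a sketch of the pattern, assuming nothing about how mhcflurry 2.1.0 actually solved it. `set_session_if_supported` is a hypothetical name, and the two `SimpleNamespace` objects are stand-ins for an old and a new Keras backend so the example runs without tensorflow.

```python
from types import SimpleNamespace


def set_session_if_supported(backend, session):
    """Call backend.set_session(session) only if the backend still provides it.

    Returns True when the call was made, False when the attribute is missing.
    Hypothetical helper for illustration; not mhcflurry code.
    """
    setter = getattr(backend, "set_session", None)
    if setter is None:
        return False
    setter(session)
    return True


# Stand-ins (assumptions): a backend that has set_session and one that does not.
legacy_backend = SimpleNamespace(set_session=lambda session: None)
modern_backend = SimpleNamespace()

print(set_session_if_supported(legacy_backend, object()))  # True
print(set_session_if_supported(modern_backend, object()))  # False
```

Skipping the call avoids the crash, but I have not tested whether the rest of mhcflurry 2.0.6 then behaves correctly on tensorflow 2.16.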

susannasiebert commented 4 months ago

You bring up a good point. Unfortunately, MHCflurry 2.1.0 is incompatible with Python 3.7 because the pinned tensorflow 2.12.0 isn't available for that Python version. For internal reasons, we have to continue supporting Python 3.7. I will talk with the team about how to move forward.

timodonnell commented 4 months ago

Yeah, I was disappointed to lose Python 3.7 support due to the new tensorflow versions. I just released mhcflurry 2.1.1, which pins the TF version to < 2.16, but unfortunately that won't fix your issue with the Python versions. A possible short-term workaround could be to stick with the version of mhcflurry you are already using, but add your own tensorflow dependency pinned to something that works. I think that would be a TF version in the range 2.2.0 - 2.9.1 for mhcflurry 2.0.6, based on what was pinned then and which TF versions had been released by that date, but I haven't tested this.

susannasiebert commented 4 months ago

Thank you @timodonnell, that is what we are leaning toward. Would it be at all possible to release an mhcflurry 2.0.7 just to pin the TF version to < 2.16? I believe that should work with Python 3.7, since the dependency resolver will just pick the latest TF version that fits that requirement and is compatible with Python 3.7?

timodonnell commented 4 months ago

That's a clever idea. Unfortunately, I have redone our PyPI package release tooling since that release, and I don't think I'll have time to figure out how to do a non-latest release using the old scripts, at least right now. Would it work to just add tensorflow as a dependency to pvactools until you are able to move to a later Python version? That would also let you pick a version that all the various tools can work with, rather than relying on the dependency manager to sort it out.
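
As a rough illustration of checking an upper bound like "< 2.16" at runtime, here is a minimal sketch that compares only the major and minor components of a version string. The function names are hypothetical, and it assumes plain `major.minor[.patch]` strings; it is not a replacement for a real version parser or for pinning in the package metadata.

```python
def parse_major_minor(version):
    """Return (major, minor) as integers from a 'major.minor[.patch]' string."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_below(version, upper_bound):
    """True if version's (major, minor) is strictly below upper_bound's."""
    return parse_major_minor(version) < parse_major_minor(upper_bound)


# Versions mentioned in this thread, checked against the "< 2.16" bound.
for candidate in ("2.9.1", "2.15.1", "2.16.1"):
    print(candidate, is_below(candidate, "2.16"))
# 2.9.1 True
# 2.15.1 True
# 2.16.1 False
```

The comparison is numeric, so 2.9.1 correctly sorts below 2.16 (a plain string comparison would get that wrong). In a real environment the version string could come from `tensorflow.__version__`.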

susannasiebert commented 4 months ago

No problem at all. I know how that goes but appreciate you entertaining the idea. I agree that us pinning the TF version is the way to go.