JaredJGartner / SB_neoantigen_Models

Cannot reproduce results in examples/ #2

Closed by RysBen 10 months ago

RysBen commented 10 months ago

Hi there,

First, I ran GenerateScores.py with the following command:

python ../src/GenerateScores.py ../examples/nmer_test_input.xlsx HLA-A02:01 HLA-A03:01 HLA-B13:02 HLA-B15:01 HLA-C05:01 HLA-C06:02

Then I compared my output, nmer_test_input_scored.xlsx, with ../examples/nmer_test_input_scored.xlsx and found that the Nmer scores were very different.

Any suggestions would be appreciated.
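
A tolerance-based, column-by-column comparison is one way to show how far the two spreadsheets diverge. The sketch below is only an illustration: `compare_rows`, its `key` argument, and the tolerance are hypothetical, and the column names in the usage lines are placeholders, not the real headers in nmer_test_input_scored.xlsx. It works on plain lists of dicts, so the rows could be loaded with, for example, pandas `read_excel(...).to_dict("records")`.

```python
import math


def compare_rows(mine, reference, key, rel_tol=1e-6):
    """Return (key value, column, my value, reference value) for each mismatch.

    `mine` and `reference` are lists of dicts (one dict per spreadsheet row).
    `key` names a column that identifies a row in both files (hypothetical).
    """
    ref_by_key = {row[key]: row for row in reference}
    mismatches = []
    for row in mine:
        ref = ref_by_key.get(row[key])
        if ref is None:
            # The row exists in my output but not in the reference file.
            mismatches.append((row[key], None, None, None))
            continue
        for col, value in row.items():
            if col == key or col not in ref:
                continue
            other = ref[col]
            numeric = isinstance(value, (int, float)) and isinstance(other, (int, float))
            if numeric:
                # Treat NaN vs NaN as equal; otherwise compare with a tolerance.
                both_nan = math.isnan(value) and math.isnan(other)
                same = both_nan or math.isclose(value, other, rel_tol=rel_tol)
            else:
                same = value == other
            if not same:
                mismatches.append((row[key], col, value, other))
    return mismatches


# Placeholder rows; the column names here are assumptions for illustration.
mine = [{"peptide": "AAA", "score": 0.5}, {"peptide": "BBB", "score": 0.1}]
reference = [{"peptide": "AAA", "score": 0.5}, {"peptide": "BBB", "score": 0.9}]
for mismatch in compare_rows(mine, reference, key="peptide"):
    print(mismatch)
```

Listing which columns differ, rather than only the final Nmer score, can help narrow down which upstream predictor is producing the unexpected values.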

RysBen commented 10 months ago

Here is the stdout:

['HLA-A02:01', 'HLA-A03:01', 'HLA-B13:02', 'HLA-B15:01', 'HLA-C05:01', 'HLA-C06:02']
running MHCflurry
Using TensorFlow backend.
WARNING:tensorflow:From /work/software/miniconda3/envs/mhcflurry-env/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
2023-10-16 14:31:05.053438: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2023-10-16 14:31:05.065651: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2400000000 Hz
2023-10-16 14:31:05.070436: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55e8cd153950 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-10-16 14:31:05.070489: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /work/software/miniconda3/envs/mhcflurry-env/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

MHCflurry complete
running netMHCSTABpan
/work/software/SB_neoantigen_Models/src/prediction_modules.py:103: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  fasta_file['mut_len'] = fasta_file['Mutant peptide'].str.len()
/work/software/SB_neoantigen_Models/src/prediction_modules.py:104: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  fasta_file['wt_len'] = fasta_file['Wild type peptide'].str.len()
/work/software/SB_neoantigen_Models/src/prediction_modules.py:106: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  fasta_file.drop_duplicates(subset= ['Mutant peptide', 'Wild type peptide'], inplace = True)
netMHCstabpan complete
Running IEDB immunogenicity score
completed IEDB immunogenicity score
Scoring MMP models
MMPs scored
Scoring NMers
/work/software/miniconda3/envs/mhcflurry-env/lib/python3.6/site-packages/sklearn/utils/deprecation.py:58: DeprecationWarning: Class Imputer is deprecated; Imputer was deprecated in version 0.20 and will be removed in 0.22. Import impute.SimpleImputer from sklearn instead.
  warnings.warn(msg, category=DeprecationWarning)

JaredJGartner commented 10 months ago

Could you share your environment with me? An exported YAML file will be fine. Please also share your output file; it looks like your MMP scores aren't correct either.

RysBen commented 10 months ago

Hi Jared,

Here is my environment:

name: mhcflurry-env
channels:
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=2_kmp_llvm
  - absl-py=0.9.0=py36_0
  - astor=0.7.1=py_0
  - blas=2.16=openblas
  - c-ares=1.15.0=h516909a_1001
  - ca-certificates=2020.4.5.1=hecc5488_0
  - certifi=2020.4.5.1=py36h9f0ad1d_0
  - grpcio=1.23.0=py36h769ab6c_1
  - h5py=2.10.0=nompi_py36h513d04c_102
  - hdf5=1.10.5=nompi_h3c11f04_1104
  - importlib-metadata=1.6.0=py36h9f0ad1d_0
  - keras-applications=1.0.8=py_1
  - keras-preprocessing=1.1.0=pyhd8ed1ab_0
  - krb5=1.17.1=h2fd8d38_0
  - ld_impl_linux-64=2.34=hc38a660_9
  - libblas=3.8.0=16_openblas
  - libcblas=3.8.0=16_openblas
  - libcurl=7.69.1=hf7181ac_0
  - libedit=3.1.20170329=hf8c457e_1001
  - libffi=3.2.1=he1b5a44_1007
  - libgcc-ng=9.2.0=h24d8f2e_2
  - libgfortran-ng=7.5.0=h14aa051_20
  - libgfortran4=7.5.0=h14aa051_20
  - liblapack=3.8.0=16_openblas
  - liblapacke=3.8.0=16_openblas
  - libopenblas=0.3.9=h5ec1e0e_0
  - libpng=1.6.37=h21135ba_2
  - libprotobuf=3.12.1=h8b12597_0
  - libssh2=1.9.0=hab1572f_5
  - libstdcxx-ng=9.2.0=hdf63c60_2
  - libzlib=1.2.11=h36c2ea0_1013
  - llvm-openmp=10.0.0=hc9558a2_0
  - markdown=3.2.2=py_0
  - mock=4.0.2=py36h9f0ad1d_1
  - ncurses=6.1=hf484d3e_1002
  - numpy=1.18.4=py36h7314795_0
  - openssl=1.1.1g=h516909a_1
  - pip=20.1.1=py_1
  - protobuf=3.12.1=py36h831f99a_0
  - python=3.6.10=h8356626_1011_cpython
  - python_abi=3.6=2_cp36m
  - readline=8.0=h46ee950_1
  - scikit-learn=0.20.1=py36h22eb022_0
  - scipy=1.4.1=py36h2d22cac_3
  - setuptools=46.4.0=py36h9f0ad1d_0
  - six=1.15.0=pyh9f0ad1d_0
  - sqlite=3.30.1=hcee41ef_0
  - termcolor=1.1.0=pyhd8ed1ab_3
  - tk=8.6.10=h21135ba_1
  - werkzeug=1.0.1=pyh9f0ad1d_0
  - wheel=0.34.2=py_1
  - xlrd=1.2.0=pyh9f0ad1d_1
  - xlsxwriter=1.2.8=py_0
  - xz=5.2.5=h516909a_1
  - zipp=3.1.0=py_0
  - zlib=1.2.11=h36c2ea0_1013
  - pip:
      - appdirs==1.4.4
      - future==0.18.2
      - gast==0.2.2
      - google-pasta==0.2.0
      - joblib==0.15.1
      - keras==2.3.1
      - mhcflurry==1.6.1
      - mhcnames==0.4.8
      - np-utils==0.5.12.1
      - opt-einsum==3.2.1
      - pandas==1.0.3
      - python-dateutil==2.8.1
      - pytz==2020.1
      - pyyaml==5.3.1
      - tensorboard==1.15.0
      - tensorflow==1.15.3
      - tensorflow-estimator==1.15.1
      - threadpoolctl==2.0.0
      - tqdm==4.46.0
      - wrapt==1.12.1
prefix: /work/software/miniconda3/envs/mhcflurry-env

Attached is the output file.

nmer_test_input_scored.xlsx
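
To compare an export like the one above against another environment, it can help to reduce it to a name-to-version mapping first. The sketch below is a minimal line-based parser written for this purpose. `pinned_versions` is a hypothetical helper, it assumes the `name=version=build` (conda) and `name==version` (pip) entry shapes seen in the export, and it is not a general YAML parser.

```python
def pinned_versions(export_text):
    """Map package name -> version from the text of a conda environment export."""
    versions = {}
    for line in export_text.splitlines():
        entry = line.strip()
        if not entry.startswith("- "):
            continue  # keys such as "name:", "channels:", "prefix:"
        entry = entry[2:].strip()
        if "==" in entry:
            # pip-style entry: name==version
            name, version = entry.split("==", 1)
        elif "=" in entry:
            # conda-style entry: name=version=build
            name, version = entry.split("=")[:2]
        else:
            continue  # channel names, or the nested "pip:" key
        versions[name] = version
    return versions


sample = """dependencies:
  - numpy=1.18.4=py36h7314795_0
  - scikit-learn=0.20.1=py36h22eb022_0
  - pip:
      - mhcflurry==1.6.1
      - pandas==1.0.3
"""
for name, version in sorted(pinned_versions(sample).items()):
    print(name, version)
```

Running the same function on two exports and comparing the resulting dicts shows which packages differ between the two environments.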

RysBen commented 10 months ago

Hi Jared,

netMHCstabpan-1.0 didn't produce the expected results, and I'm troubleshooting this.

Thanks for your reply.
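
For anyone hitting the same problem, a small preflight check on an external predictor might look like the sketch below. It rests on assumptions: `missing_tools` and `produced_output` are hypothetical helpers, the executable name `netMHCstabpan` is a guess, and this thread does not show how GenerateScores.py locates or invokes the tool. The point it illustrates is that a "complete" message in the log does not by itself prove that a tool returned usable output.

```python
import shutil
import subprocess


def missing_tools(names):
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def produced_output(cmd):
    """Return True if `cmd` exits with status 0 and prints something to stdout."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # text mode; also works on Python 3.6
        )
    except OSError:
        # Covers a missing or non-executable program.
        return False
    return result.returncode == 0 and bool(result.stdout.strip())


# The executable name below is an assumption, not taken from the repository.
print(missing_tools(["netMHCstabpan"]))
```

`produced_output` takes the full command as a list of arguments, so the real invocation with its arguments would need to be supplied by the user.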