kipoi / models

Model zoo for genomics
http://kipoi.org
MIT License

Update input shape #327

Closed bernardo-de-almeida closed 2 years ago

bernardo-de-almeida commented 2 years ago

The input shape is incorrect: the model uses (249, 4), and the current shape produces the following error: "ValueError: Error when checking input: expected input_12 to have 3 dimensions, but got array with shape (4, 249, 4, 1)"

haimasree commented 2 years ago

Hi @bernardo-de-almeida, may I ask where exactly you are getting this error? We have nightly tests in place that create a model-specific conda environment from scratch and run kipoi test <model-name> --source=kipoi; in DeepSTARR's case they also compare the resulting values against the predictions stored here. An example run is yesterday's nightly test. If you don't have a CircleCI account, this is what it says:

4/8 - model: DeepSTARR
--------------------
INFO [kipoi.cli.env] Writing environment file: /tmp/kipoi/envfiles/09950047
INFO [kipoi.cli.env] Loading model: DeepSTARR description
INFO [kipoi.sources] Update /home/circleci/repo/
Already up to date.
INFO [kipoi.cli.env] kipoiseq not installed. Using default kipoiseq dependencies for the dataloader: kipoiseq.dataloaders.SeqIntervalDl
INFO [kipoi.cli.env] Environment name: test-kipoi-DeepSTARR
INFO [kipoi.cli.env] Output env file: /tmp/kipoi/envfiles/09950047/test-kipoi-DeepSTARR.yaml
INFO [kipoi.cli.env] Done writing the environment file!
INFO [kipoi.cli.env] Creating conda env from file: /tmp/kipoi/envfiles/09950047/test-kipoi-DeepSTARR.yaml
INFO [kipoi.cli.env] Done!

Activate the environment via:
conda activate test-kipoi-DeepSTARR
INFO [kipoi.sources] Update /home/circleci/repo/
Already up to date.
INFO [kipoi.model] Downloading model arguments arch from https://zenodo.org/record/5502060/files/DeepSTARR.model.json?download=1
Downloading https://zenodo.org/record/5502060/files/DeepSTARR.model.json?download=1 to /home/circleci/repo/DeepSTARR/downloaded/model_files/arch/9b796f79441e53dc75dd79b911fff872
16.4kB [00:01, 9.88kB/s]                
INFO [kipoi.model] Downloading model arguments weights from https://zenodo.org/record/5502060/files/DeepSTARR.model.h5?download=1
Downloading https://zenodo.org/record/5502060/files/DeepSTARR.model.h5?download=1 to /home/circleci/repo/DeepSTARR/downloaded/model_files/weights/7e53a9351b2520a4713a5ffdb5f1566c
2.56MB [00:02, 1.05MB/s]                
2022-05-27 09:43:17.209117: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-05-27 09:43:17.209148: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-05-27 09:43:18.162728: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-05-27 09:43:18.162760: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-05-27 09:43:18.162783: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ip-172-28-27-83): /proc/driver/nvidia/version does not exist
2022-05-27 09:43:18.163021: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO [kipoi.model] successfully loaded model architecture from <_io.TextIOWrapper name='/home/circleci/repo/DeepSTARR/downloaded/model_files/arch/9b796f79441e53dc75dd79b911fff872' mode='r' encoding='utf-8'>
2022-05-27 09:43:18.261243: W tensorflow/core/util/tensor_slice_reader.cc:96] Could not open /home/circleci/repo/DeepSTARR/downloaded/model_files/weights/7e53a9351b2520a4713a5ffdb5f1566c: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
INFO [kipoi.model] successfully loaded model weights from /home/circleci/repo/DeepSTARR/downloaded/model_files/weights/7e53a9351b2520a4713a5ffdb5f1566c
INFO [kipoi.pipeline] dataloader.output_schema is compatible with model.schema
INFO [kipoi.pipeline] Initialized data generator. Running batches...
Downloading https://raw.githubusercontent.com/kipoi/kipoiseq/master/tests/data/intervals_51bp.tsv to /home/circleci/repo/DeepSTARR/downloaded/example_files/intervals_file
8.19kB [00:00, 519kB/s]
Downloading https://raw.githubusercontent.com/kipoi/kipoiseq/master/tests/data/hg38_chr22_32000000_32300000.fa to /home/circleci/repo/DeepSTARR/downloaded/example_files/fasta_file
303kB [00:00, 20.2MB/s]
INFO [kipoi.pipeline] Returned data schema correct
100% 3/3 [00:00<00:00,  4.77it/s]
INFO [kipoi.pipeline] predict_example done!
Downloading https://zenodo.org/record/6553385/files/DeepSTARR.predictions.h5?download=1 to /home/circleci/repo/DeepSTARR/downloaded/model_files/test.expect.h5
98.3kB [00:01, 62.5kB/s]                
INFO [kipoi.cli.main] Testing if the predictions match the expected ones in the file: /home/circleci/repo/DeepSTARR/downloaded/model_files/test.expect.h5
INFO [kipoi.cli.main] Desired precision (number of matching decimal places): 4
3it [00:00,  5.64it/s]           
INFO [kipoi.cli.main] All predictions match
INFO [kipoi.cli.main] Successfully ran test_predict
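
For readers unfamiliar with the last step in the log above, the comparison against the stored predictions ("number of matching decimal places: 4") can be pictured roughly as follows. This is only a minimal sketch: the helper name `predictions_match` is hypothetical, and the tolerance rule is borrowed from `numpy.testing.assert_array_almost_equal`, which may differ from what kipoi actually does internally.

```python
import numpy as np


def predictions_match(predicted, expected, decimals=4):
    """Return True if both arrays have the same shape and agree to `decimals` places.

    Hypothetical helper for illustration; not kipoi's implementation.
    """
    predicted = np.asarray(predicted, dtype=float)
    expected = np.asarray(expected, dtype=float)
    if predicted.shape != expected.shape:
        return False
    # Tolerance rule used by numpy.testing.assert_array_almost_equal:
    # abs(desired - actual) < 1.5 * 10**(-decimal)
    tolerance = 1.5 * 10 ** (-decimals)
    return bool(np.all(np.abs(predicted - expected) < tolerance))


# Differences far below the fourth decimal place are accepted,
# a difference in the third decimal place is not.
close_enough = predictions_match([0.12345, 1.0], [0.12349, 1.0])
too_far = predictions_match([0.123, 1.0], [0.124, 1.0])
```

A check like this only tells you that the environment reproduces the stored numbers; it says nothing about whether the declared input shape is the one the model was trained with.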

On the other hand, if I change the input shape to what you suggested, I get the following warning, although the test finishes successfully:

ArraySchema mismatch
Array shapes don't match for the fields:
--
Model
--
shape: (249, 4)
doc: DNA sequence

--
Dataloader
--
shape: (249, 4, 1)
doc: One-hot encoded DNA sequence
name: seq
special_type: DNASeq
associated_metadata:
- ranges

--
Provided shape (without the batch axis): (249, 4, 1), expected shape: (249, 4) 
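
To make the mismatch above concrete, here is a small sketch of the two shapes being compared and of how a trailing singleton axis changes the number of dimensions. The function `shapes_compatible` is a hypothetical stand-in, not kipoi's ArraySchema code, and the squeeze at the end only shows how the shapes relate; it is not a recommendation for how the model or dataloader should be changed.

```python
import numpy as np


def shapes_compatible(provided, expected):
    """Compare per-sample shapes (batch axis excluded).

    Hypothetical helper: a None entry in `expected` matches any length.
    """
    if len(provided) != len(expected):
        return False
    return all(e is None or p == e for p, e in zip(provided, expected))


dataloader_shape = (249, 4, 1)  # shape reported for the dataloader in the warning
model_shape = (249, 4)          # shape proposed for the model in this PR

# A batch of 10 sequences as the dataloader would emit it: 4 dimensions in total.
batch = np.zeros((10,) + dataloader_shape)

# Dropping the trailing singleton axis gives the 3-dimensional layout (10, 249, 4).
squeezed = np.squeeze(batch, axis=-1)

print(shapes_compatible(dataloader_shape, model_shape))   # False
print(shapes_compatible(squeezed.shape[1:], model_shape)) # True
```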

Please write back with a test case where you are getting this error and I will look into it. I will look into why test_new_models is failing later as well.

Thanks!

bernardo-de-almeida commented 2 years ago

Many thanks @haimasree for looking into this!

So far I am having two issues:

(1) When I create the conda environment with all dependencies installed by kipoi, the test fails due to kipoiseq.py. See the example and its output below:

    kipoi env create DeepSTARR
    source activate kipoi-DeepSTARR
    kipoi test DeepSTARR --source=kipoi

INFO [kipoi.sources] Update .kipoi/models/
Already up-to-date.
Traceback (most recent call last):
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoi_utils/utils.py", line 134, in load_obj
    module = importlib.import_module(module_name)
  File "envs/kipoi-DeepSTARR/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoiseq/__init__.py", line 11, in <module>
    from . import dataloaders
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoiseq/dataloaders/__init__.py", line 1, in <module>
    from .sequence import *
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoiseq/dataloaders/sequence.py", line 3, in <module>
    import pyranges as pr
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/pyranges/__init__.py", line 137, in <module>
    import pyranges.genomicfeatures as gf
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/pyranges/genomicfeatures.py", line 7, in <module>
    from sorted_nearest.src.introns import find_introns
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/sorted_nearest/__init__.py", line 7, in <module>
    from sorted_nearest.src.k_nearest_ties import get_all_ties, get_different_ties
ImportError: cannot import name 'get_all_ties' from 'sorted_nearest.src.k_nearest_ties' (envs/kipoi-DeepSTARR/lib/python3.8/site-packages/sorted_nearest/src/k_nearest_ties.cpython-38-x86_64-linux-gnu.so)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoi_utils/utils.py", line 140, in load_obj
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 839, in exec_module
  File "<frozen importlib._bootstrap_external>", line 975, in get_code
  File "<frozen importlib._bootstrap_external>", line 1032, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '.kipoi/models/DeepSTARR/kipoiseq.py'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "envs/kipoi-DeepSTARR/bin/kipoi", line 8, in <module>
    sys.exit(main())
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoi/__main__.py", line 107, in main
    command_fn(args.command, sys.argv[2:])
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoi/cli/main.py", line 69, in cli_test
    mh = kipoi.get_model(args.model, args.source)
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoi/model.py", line 125, in get_model
    default_dataloader = md.default_dataloader.get()
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoi/specs.py", line 805, in get
    obj = load_obj(self.defined_as)
  File "envs/kipoi-DeepSTARR/lib/python3.8/site-packages/kipoi_utils/utils.py", line 143, in load_obj
    raise ImportError("object {} couldn't be imported. Error {}".format(obj_import, str(e)))

ImportError: object kipoiseq.dataloaders.SeqIntervalDl couldn't be imported. Error [Errno 2] No such file or directory: '.kipoi/models/DeepSTARR/kipoiseq.py'

bernardo-de-almeida commented 2 years ago

(2) If I create the environment myself with the following commands, the test runs until it hits the input shape error. Here is the test case you requested:

    # Create a new conda environment
    conda create --name test_env2 python=3.7 tensorflow=1.14.0 keras=2.2.4
    source activate test_env2

    pip install 'h5py<3.0.0'
    pip install kipoiseq
    pip install pybedtools

    # Test the model
    kipoi test DeepSTARR --source=kipoi

INFO [kipoi.sources] Update .kipoi/models/
Already up-to-date.
INFO [kipoi.model] Downloading model arguments arch from https://zenodo.org/record/5502060/files/DeepSTARR.model.json?download=1
Using downloaded and verified file: .kipoi/models/DeepSTARR/downloaded/model_files/arch/9b796f79441e53dc75dd79b911fff872
INFO [kipoi.model] Downloading model arguments weights from https://zenodo.org/record/5502060/files/DeepSTARR.model.h5?download=1
Using downloaded and verified file: .kipoi/models/DeepSTARR/downloaded/model_files/weights/7e53a9351b2520a4713a5ffdb5f1566c
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Using TensorFlow backend.
INFO [kipoi.model] successfully loaded model architecture from <_io.TextIOWrapper name='.kipoi/models/DeepSTARR/downloaded/model_files/arch/9b796f79441e53dc75dd79b911fff872' mode='r' encoding='utf-8'>
2022-05-28 01:25:07.834609: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2022-05-28 01:25:07.852316: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1995310000 Hz
2022-05-28 01:25:07.857662: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ff1e3c9990 executing computations on platform Host. Devices:
2022-05-28 01:25:07.857708: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2022-05-28 01:25:07.858098: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2022-05-28 01:25:08.209763: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
INFO [kipoi.model] successfully loaded model weights from .kipoi/models/DeepSTARR/downloaded/model_files/weights/7e53a9351b2520a4713a5ffdb5f1566c
INFO [kipoi.pipeline] dataloader.output_schema is compatible with model.schema
INFO [kipoi.pipeline] Initialized data generator. Running batches...
INFO [kipoi.specs] Example file for argument intervals_file already exists
INFO [kipoi.specs] Example file for argument fasta_file already exists
INFO [kipoi.pipeline] Returned data schema correct
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "envs/test_env2/bin/kipoi", line 8, in <module>
    sys.exit(main())
  File "envs/test_env2/lib/python3.7/site-packages/kipoi/__main__.py", line 107, in main
    command_fn(args.command, sys.argv[2:])
  File "envs/test_env2/lib/python3.7/site-packages/kipoi/cli/main.py", line 81, in cli_test
    mh.pipeline.predict_example(batch_size=args.batch_size, output_file=args.output, **config_kwargs)
  File "envs/test_env2/lib/python3.7/site-packages/kipoi/pipeline.py", line 133, in predict_example
    pred_batch = self.model.predict_on_batch(batch['inputs'])
  File "envs/test_env2/lib/python3.7/site-packages/kipoi/model.py", line 390, in predict_on_batch
    return self.model.predict_on_batch(x)
  File "envs/test_env2/lib/python3.7/site-packages/keras/engine/training.py", line 1268, in predict_on_batch
    x, _, _ = self._standardize_user_data(x)
  File "envs/test_env2/lib/python3.7/site-packages/keras/engine/training.py", line 751, in _standardize_user_data
    exception_prefix='input')
  File "envs/test_env2/lib/python3.7/site-packages/keras/engine/training_utils.py", line 128, in standardize_input_data
    'with shape ' + str(data_shape))

ValueError: Error when checking input: expected input_12 to have 3 dimensions, but got array with shape (10, 249, 4, 1)

haimasree commented 2 years ago

Hi @bernardo-de-almeida, thanks for reporting. Our nightly tests caught this issue. I have described the problem here and provided the fix here. In short, kipoiseq depends on sorted-nearest, whose latest version 0.0.35 seems to be broken. I am hoping kipoiseq will get a fresh release on PyPI, which will automatically fix the failing nightly tests without having to manually pin sorted-nearest across five kipoi repos, including DeepSTARR. However, I wanted to test it out just to double check. As it turns out, DeepSTARR is also affected by a newer version of another package, protobuf. I have pinned these two packages here and the test seems to be passing as before. Check here.

I will wait until the end of today for a new release of kipoiseq, after which I will start pinning sorted-nearest and any other packages that need it. In the meantime, if you want to test locally, please add this to model.yaml:

    pip:
      - keras==2.7.0
      - tensorflow==2.7.0
      - sorted-nearest==0.0.33
      - protobuf==3.20

Then run the following (assuming you are in the models directory):

kipoi env create DeepSTARR --source=dir
source activate dir-DeepSTARR
kipoi test DeepSTARR --source=kipoi

Hope this helps.

haimasree commented 2 years ago

Interesting. I wonder why this error does not show up in the environment described by model.yaml. Is there a specific reason you are using those versions of tensorflow and keras? Did you use these exact versions while training DeepSTARR? As described in my comment above, once kipoi env create DeepSTARR --source=kipoi is resolved, everything else should fall back into place. So this PR would no longer be necessary, unless you disagree.

haimasree commented 2 years ago

Just FYI, all kipoi models now come with ready-to-use Docker and Singularity containers. Of course, by the nature of containerization, they are exempt from these types of issues. If you are interested, check out the singularity and docker tab here.

bernardo-de-almeida commented 2 years ago

Hi @bernardo-de-almeida Thanks for reporting. Our nightly tests caught this issue. I have described the problem here and provided the fix here. In short, kipoiseq depends on sorted-nearest whose latest version 0.0.35 seems to be broken. I am hoping kipoiseq will get a fresh release in pypi which will automatically fix the failing nightly tests without having to manually pin sorted-nearest across five kipoi repos including DeepSTARR. However, I wanted to test it out just to double check. As it turns out DeepSTARR also is affected by a newer version of another package protobuf. I have pinned these two packages here and test seems to be passing as before. Check here.

I will wait until end of today for a new release of kipoiseq after which I will start pinning sorted-nearest and necessary others. In the mean time, if you want to test locally please add this to model.yaml

    pip:
      - keras==2.7.0
      - tensorflow==2.7.0
      - sorted-nearest==0.0.33
      - protobuf==3.20

Then run the following (assuming you are in the models directory):

    kipoi env create DeepSTARR --source=dir
    source activate dir-DeepSTARR
    kipoi test DeepSTARR --source=kipoi

Hope this helps.
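
As a quick sanity check after creating the environment, the installed versions can be compared against the pins listed above. This is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); `PINS`, `normalize`, and `check_pins` are hypothetical names introduced here, not part of kipoi, and the comparison only handles plain numeric release versions (it treats `3.20` and `3.20.0` as equal by dropping trailing zeros).

```python
from importlib import metadata

# Pins copied from the model.yaml snippet in this comment.
PINS = {
    "keras": "2.7.0",
    "tensorflow": "2.7.0",
    "sorted-nearest": "0.0.33",
    "protobuf": "3.20",
}


def normalize(version):
    """Split a plain numeric version and drop trailing zero components."""
    parts = version.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return parts


def check_pins(pins, get_version=metadata.version):
    """Return {package: (installed_or_None, pinned)} for every mismatch."""
    problems = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed is None or normalize(installed) != normalize(pinned):
            problems[name] = (installed, pinned)
    return problems


if __name__ == "__main__":
    for name, (installed, pinned) in check_pins(PINS).items():
        print(f"{name}: installed={installed}, pinned={pinned}")
```

Run inside the activated environment, the script prints nothing when every pin is satisfied. The `get_version` parameter exists only so the lookup can be replaced in tests.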

Hi @haimasree, thanks a lot! I have tested with the pinned versions of sorted-nearest and protobuf, and it worked. Will you incorporate these changes later as you said, or do you want me to do it?

bernardo-de-almeida commented 2 years ago

(2) If I create the environment myself with the following code, it works until the error with the input shape. See below the test case you requested:

    # Create a new conda environment
    conda create --name test_env2 python=3.7 tensorflow=1.14.0 keras=2.2.4
    source activate test_env2
    pip install 'h5py<3.0.0'
    pip install kipoiseq
    pip install pybedtools

    # Test the model
    kipoi test DeepSTARR --source=kipoi

INFO [kipoi.sources] Update .kipoi/models/
Already up-to-date.
INFO [kipoi.model] Downloading model arguments arch from https://zenodo.org/record/5502060/files/DeepSTARR.model.json?download=1
Using downloaded and verified file: .kipoi/models/DeepSTARR/downloaded/model_files/arch/9b796f79441e53dc75dd79b911fff872
INFO [kipoi.model] Downloading model arguments weights from https://zenodo.org/record/5502060/files/DeepSTARR.model.h5?download=1
Using downloaded and verified file: .kipoi/models/DeepSTARR/downloaded/model_files/weights/7e53a9351b2520a4713a5ffdb5f1566c
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
envs/test_env2/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Using TensorFlow backend.
INFO [kipoi.model] successfully loaded model architecture from <_io.TextIOWrapper name='.kipoi/models/DeepSTARR/downloaded/model_files/arch/9b796f79441e53dc75dd79b911fff872' mode='r' encoding='utf-8'>
2022-05-28 01:25:07.834609: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2022-05-28 01:25:07.852316: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1995310000 Hz
2022-05-28 01:25:07.857662: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ff1e3c9990 executing computations on platform Host. Devices:
2022-05-28 01:25:07.857708: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2022-05-28 01:25:07.858098: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2022-05-28 01:25:08.209763: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
INFO [kipoi.model] successfully loaded model weights from .kipoi/models/DeepSTARR/downloaded/model_files/weights/7e53a9351b2520a4713a5ffdb5f1566c
INFO [kipoi.pipeline] dataloader.output_schema is compatible with model.schema
INFO [kipoi.pipeline] Initialized data generator. Running batches...
INFO [kipoi.specs] Example file for argument intervals_file already exists
INFO [kipoi.specs] Example file for argument fasta_file already exists
INFO [kipoi.pipeline] Returned data schema correct
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "envs/test_env2/bin/kipoi", line 8, in <module>
    sys.exit(main())
  File "envs/test_env2/lib/python3.7/site-packages/kipoi/__main__.py", line 107, in main
    command_fn(args.command, sys.argv[2:])
  File "envs/test_env2/lib/python3.7/site-packages/kipoi/cli/main.py", line 81, in cli_test
    mh.pipeline.predict_example(batch_size=args.batch_size, output_file=args.output, **config_kwargs)
  File "envs/test_env2/lib/python3.7/site-packages/kipoi/pipeline.py", line 133, in predict_example
    pred_batch = self.model.predict_on_batch(batch['inputs'])
  File "envs/test_env2/lib/python3.7/site-packages/kipoi/model.py", line 390, in predict_on_batch
    return self.model.predict_on_batch(x)
  File "envs/test_env2/lib/python3.7/site-packages/keras/engine/training.py", line 1268, in predict_on_batch
    x, _, _ = self._standardize_user_data(x)
  File "envs/test_env2/lib/python3.7/site-packages/keras/engine/training.py", line 751, in _standardize_user_data
    exception_prefix='input')
  File "envs/test_env2/lib/python3.7/site-packages/keras/engine/training_utils.py", line 128, in standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_12 to have 3 dimensions, but got array with shape (10, 249, 4, 1)

Interesting. I wonder why this error does not show up in the environment described by model.yaml. Is there a specific reason you are using those versions of tensorflow and keras? Did you use these exact versions while training DeepSTARR? As described in my comment above, once kipoi env create DeepSTARR --source=kipoi is resolved, everything else will fall back into place. So this PR would no longer be necessary, unless you disagree.

Yes, these are the exact versions used when training DeepSTARR. But as you said, it wasn't needed. The changes above solved it. Thanks!

haimasree commented 2 years ago

Excellent. I will wait until the end of today to hear back from the kipoiseq maintainer. If I don't, I will merge these changes into the master branch by tomorrow.

haimasree commented 2 years ago

Hi @bernardo-de-almeida, I have now migrated all the changes to the master branch, and yesterday's nightly tests passed as well. kipoi env create DeepSTARR --source=kipoi and kipoi test DeepSTARR should now work without any problem. If you think I should still investigate this error in the environment you specified, I am happy to convert it into an issue and look at it when I have more time. Let me know.

bernardo-de-almeida commented 2 years ago

Great, many thanks @haimasree !

haimasree commented 2 years ago

Just asking for bookkeeping - should we keep this open until your issue is resolved? Like I said, if you think it is necessary, I can convert this into an issue as well and close the PR.

bernardo-de-almeida commented 2 years ago

No, it's fine for now. Thanks 👍

haimasree commented 2 years ago

Great! Closing this PR - feel free to open another issue/PR if there is anything else I can help with.