kundajelab / bpnet

Toolkit to train base-resolution deep neural networks on functional genomics data and to interpret them
http://bit.ly/bpnet-colab
MIT License

"No object to concatenate" error #20

Closed: eyalbenda closed this issue 3 years ago

eyalbenda commented 3 years ago

Hi,

I'm trying to test bpnet on C. elegans ChIP-seq data from ENCODE. I made bigWig (bw) files from the ENCODE BAM files. I then made the following modifications to bpnet9 (because of the C. elegans genome):

exclude_chr=["chrM"]
valid_chr = ['chrI']
test_chr = ['chrII', 'chrIII', 'chrIV',
            'chrX', 'chrV']

I made the following yaml file:

fasta_file: /Users/termivac/Documents/10xThreeSpecies/ERC_prep/ce11/ce11.fa  # reference genome fasta file
task_specs:  # specifies multiple tasks (e.g. Oct4, Sox2 Nanog)

  CEH83: # Nanog is the task name
    tracks:
      - /Users/termivac/Documents/10xThreeSpecies/ERC_prep/tf_tracks/ceh-83_test.pos.bw
      - /Users/termivac/Documents/10xThreeSpecies/ERC_prep/tf_tracks/ceh-83_test.neg.bw

bias_specs:  # specifies multiple bias tracks
  input:  # first bias track
    tracks:  # can specify multiple tracks
      - /Users/termivac/Documents/10xThreeSpecies/ERC_prep/tf_tracks/ceh-83_input.pos.bw
      - /Users/termivac/Documents/10xThreeSpecies/ERC_prep/tf_tracks/ceh-83_input.neg.bw
    tasks:  # applies to Oct4, Sox2, Nanog tasks
      - CEH83

I'm running bpnet using the following command:

bpnet train --premade=bpnet9 --vmtouch CEH83.yml CEH83_output

I'm getting what looks like a parsing error:

Using TensorFlow backend.
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From /Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

2021-03-20 15:51:24,395 [WARNING] From /Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

2021-03-20 15:51:25,029 [INFO] Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2021-03-20 15:51:25,029 [INFO] NumExpr defaulting to 8 threads.
/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/bpnet/plot/heatmaps.py:6: MatplotlibDeprecationWarning:
The mpl_toolkits.axes_grid1.colorbar module was deprecated in Matplotlib 3.2 and will be removed two minor releases later. Use matplotlib.colorbar instead.
  from mpl_toolkits.axes_grid1.colorbar import colorbar
/Users/termivac/Documents/10xThreeSpecies/ERC_prep/ce11/ce11.fa
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 24974/24974

           Files: 1
     Directories: 0
   Touched Pages: 24974 (97M)
         Elapsed: 0.036978 seconds
/Users/termivac/Documents/10xThreeSpecies/ERC_prep/tf_tracks/ceh-83_test.pos.bw
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 27285/27285

           Files: 1
     Directories: 0
   Touched Pages: 27285 (106M)
         Elapsed: 0.046646 seconds
/Users/termivac/Documents/10xThreeSpecies/ERC_prep/tf_tracks/ceh-83_test.neg.bw
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 27285/27285

           Files: 1
     Directories: 0
   Touched Pages: 27285 (106M)
         Elapsed: 0.048614 seconds
/Users/termivac/Documents/10xThreeSpecies/ERC_prep/tf_tracks/ceh-83_input.pos.bw
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 3533/3533

           Files: 1
     Directories: 0
   Touched Pages: 3533 (13M)
         Elapsed: 0.006081 seconds
/Users/termivac/Documents/10xThreeSpecies/ERC_prep/tf_tracks/ceh-83_input.neg.bw
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 3533/3533

           Files: 1
     Directories: 0
   Touched Pages: 3533 (13M)
         Elapsed: 0.006723 seconds
INFO [03-20 15:51:26] Using gpu: 0, memory fraction: 0.45
2021-03-20 15:51:26.372802: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO [03-20 15:51:26] Using the following premade configuration: bpnet9
TF-MoDISco is using the TensorFlow backend.
Traceback (most recent call last):
  File "/Users/termivac/anaconda3/envs/bpnet/bin/bpnet", line 8, in <module>
    sys.exit(main())
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/bpnet/__main__.py", line 38, in main
    argh.dispatch(parser)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/argh/dispatching.py", line 260, in _call
    result = function(*positional, **keywords)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/bpnet/cli/train.py", line 697, in bpnet_train
    gpu=gpu)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/gin/config.py", line 1009, in gin_wrapper
    new_kwargs = copy.deepcopy(new_kwargs)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/copy.py", line 161, in deepcopy
    y = copier(memo)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/gin/config.py", line 381, in __deepcopy__
    return self._scoped_configurable_fn()
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/gin/config.py", line 1069, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/gin/config.py", line 1046, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/bpnet/datasets.py", line 477, in bpnet_data
    interval_transformer=interval_transformer),
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/gin/config.py", line 1069, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/gin/config.py", line 1046, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/bpnet/datasets.py", line 278, in __init__
    for task, task_spec in self.ds.task_specs.items()
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 284, in concat
    sort=sort,
  File "/Users/termivac/anaconda3/envs/bpnet/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 331, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
  In call to configurable 'StrandedProfile' (<class 'bpnet.datasets.StrandedProfile'>)
  In call to configurable 'bpnet_data' (<function bpnet_data at 0x7fe77139e488>)

I'd appreciate any help in getting the software to work.

Best,

Eyal

Avsecz commented 3 years ago

Hi,

You are missing the `peaks` entry in your YAML file (example). If you want to train genome-wide, you can generate a BED file that tiles the whole genome.

Best, Ziga

eyalbenda commented 3 years ago

Hi Ziga, thanks for the help! I made a BED file from chrom_sizes (just one line per chromosome, from start to end). Would that work, or do I need to split the chromosomes into segments in the BED file?

Avsecz commented 3 years ago

You need to split the chromosome into segments. Every segment will be one training example.
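
For anyone landing here later, splitting the chromosomes into segments can be sketched roughly as below. This is only an illustration using the Python standard library, not code from bpnet: the function names are made up, the 1000 bp window width is an arbitrary placeholder (pick whatever matches the region width your model configuration expects), and dropping the final partial window of each chromosome is a choice made here so that every segment has the same width.

```python
import io


def read_chrom_sizes(handle):
    """Parse a two-column chrom.sizes file (name, length) into (str, int) pairs."""
    for line in handle:
        if line.strip():
            name, length = line.split()[:2]
            yield name, int(length)


def tile_genome(chrom_sizes, width=1000, exclude=("chrM",)):
    """Yield non-overlapping (chrom, start, end) windows of a fixed width.

    `width` and `exclude` are illustrative defaults (assumptions), not bpnet settings.
    The trailing partial window of each chromosome is skipped.
    """
    for chrom, size in chrom_sizes:
        if chrom in exclude:
            continue
        for start in range(0, size - width + 1, width):
            yield chrom, start, start + width


def write_bed(intervals, handle):
    """Write intervals as tab-separated BED lines (0-based start, exclusive end)."""
    for chrom, start, end in intervals:
        handle.write(f"{chrom}\t{start}\t{end}\n")


if __name__ == "__main__":
    # Toy input standing in for a real chrom.sizes file.
    toy_sizes = io.StringIO("chrI\t2500\nchrM\t1200\n")
    out = io.StringIO()
    write_bed(tile_genome(read_chrom_sizes(toy_sizes), width=1000), out)
    print(out.getvalue(), end="")
```

In real use you would open the chrom.sizes file and an output `.bed` file instead of the `io.StringIO` objects, then point the `peaks` entry mentioned above at the resulting file. Each line of that file is then one segment, i.e. one training example.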

eyalbenda commented 3 years ago

Thank you!