calico / basenji

Sequential regulatory activity predictions with deep convolutional neural networks.
Apache License 2.0

Question about Basenji2 receptive field #163

Closed pkathail closed 1 year ago

pkathail commented 1 year ago

Hi,

I was using the pre-trained Basenji2 human model to make some predictions and was wondering: what is the receptive field of the model? I tried estimating it for different output bins by counting the number of input sequence positions with non-zero gradients (code below, based on your gradients() function). From that, I got a receptive field of ~55 kb for most bins, dropping below 40 kb for bins near the edges. Is that correct?

Bin    Receptive field
0      35,716 bp
1      35,844 bp
2      35,972 bp
3      36,100 bp
300    54,920 bp
448    54,920 bp
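
For reference, seq_1hot in the code below is the one-hot encoded input sequence. I build it with something like the following sketch (dna_io.dna_1hot is from this repo; the literal sequence and its 131,072 bp length here are placeholders, so substitute your real input):

from basenji import dna_io

# placeholder input: a repeated motif standing in for a real 131,072 bp sequence
seq = 'ACGT' * (131072 // 4)
seq_1hot = dna_io.dna_1hot(seq)  # -> (seq_length, 4) numpy array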
import json
import numpy as np
import pandas as pd
import tensorflow as tf
from basenji import seqnn

params_file = "{path}/basenji2/models/params_human.json"
model_file = "{path}/basenji2/model_human.h5"
targets_file = "{path}/basenji2/human/targets.txt"
rc = True
shifts = [-1, 0, 1]

# load model parameters
with open(params_file) as params_open:
    params = json.load(params_open)
params_model = params['model']
params_train = params['train']

# load targets
targets_df = pd.read_csv(targets_file, sep='\t', index_col=0)
target_slice = targets_df.index

# load model, slice to the target set, and build the rc/shift ensemble
model = seqnn.SeqNN(params_model)
model.restore(model_file)
model.build_slice(target_slice)
model.build_ensemble(rc, shifts)
model_ensemble = model.ensemble

# verify tensor shape (seq_1hot is the one-hot sequence built above)
seq_1hot = tf.convert_to_tensor(seq_1hot, dtype=tf.float32)
if len(seq_1hot.shape) < 3:
    seq_1hot = tf.expand_dims(seq_1hot, axis=0)  # add batch dimension

# batching parameters
batch_size = 1
num_targets = model_ensemble.output_shape[-1]
num_batches = int(np.ceil(num_targets / batch_size))

for bin_ in [0, 1, 2, 3, 300, 448]:
  ti_start = 0
  # sequence input (seq_length is an attribute of the SeqNN wrapper, not the Keras ensemble)
  sequence = tf.keras.Input(shape=(model.seq_length, 4), name='sequence')

  # predict
  predictions = model_ensemble(sequence)

  # slice out the target batch and the output bin of interest
  ti_end = min(num_targets, ti_start + batch_size)
  target_slice = np.arange(ti_start, ti_end)
  predictions_slice = tf.gather(predictions, target_slice, axis=-1)
  predictions_slice = tf.gather(predictions_slice, bin_, axis=-2)

  # replace model
  model_batch = tf.keras.Model(inputs=sequence, outputs=predictions_slice)

  with tf.GradientTape() as tape:
      tape.watch(seq_1hot)
      # predict
      preds = model_batch(seq_1hot, training=False)

      # sum over the batch axis (the output bin was already selected above)
      preds = tf.reduce_sum(preds, axis=-2)

      # compute jacobian
      grads = tape.jacobian(preds, seq_1hot)
      grads = tf.squeeze(grads)
  # number of distinct input positions with non-zero gradient = receptive field in bp
  print(bin_, np.unique(np.where(grads.numpy() != 0)[0]).shape)
davek44 commented 1 year ago

Hi Pooja, that sounds right to me; those are the numbers I remember from the paper. At the boundaries, you lose the opportunity to look in both directions, so the number drops. The dilated convolution layers are scaling the dilation rate by 1.5x each time, but the values get rounded up to integers, which makes it hard to write out the math with a simple formula.
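
To sketch the idea (this isn't code from the repo; the layer count, kernel width, bin size, and exact rounding rule below are assumptions to check against params_human.json and basenji/layers.py):

import math

kernel_size = 3   # assumed dilated conv kernel width
rate_mult = 1.5   # dilation multiplier per layer (from the comment above)
num_layers = 11   # assumed number of dilated residual blocks
bin_size = 128    # assumed output bin width in bp

rf_bins = 1       # receptive field in bins entering the dilated tower
dilation = 1.0
for _ in range(num_layers):
    d = math.ceil(dilation)            # "rounded up to integers"
    rf_bins += (kernel_size - 1) * d   # each conv widens the field by (k-1)*d bins
    dilation *= rate_mult

print(f'~{rf_bins} bins ~= {rf_bins * bin_size:,} bp (dilated tower only)')

With those assumed values the dilated tower alone reaches ~355 bins (~45 kb); the convolution/pooling tower in front of it widens the field further, which lands in the same ballpark as the ~55 kb you measured.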

pkathail commented 1 year ago

@davek44 thank you!