google-research / pix2seq

Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
Apache License 2.0

Input multiple sequences per image #17

Open qihao067 opened 1 year ago

qihao067 commented 1 year ago

Hello, fantastic work on Pix2seq v1 and v2.

I have a question about handling multiple sequences for one image. From the following code, it seems we could input multiple sequences by passing a tensor of size (bsz, instances, seqlen), whereas the current version uses a seq of size (bsz, seqlen).

https://github.com/google-research/pix2seq/blob/6d45f77fcbb1905aca3e42678a2a079907ad17d0/models/ar_model.py#L84-L85

I tried this but it failed:

```
ValueError: Exception encountered when calling layer "ar_decoder" (type AutoregressiveDecoder).

in user code:

    File "/pix2seq/architectures/transformers.py", line 684, in call  *
        _, seqlen = get_shape(tokens)

    ValueError: too many values to unpack (expected 2)

Call arguments received by layer "ar_decoder" (type AutoregressiveDecoder):
  • tokens=tf.Tensor(shape=(32, 3, 500), dtype=int64)
  • encoded=tf.Tensor(shape=(32, 1600, 512), dtype=float32)
  • training=True

Call arguments received by layer "model" (type Model):
  • images=tf.Tensor(shape=(32, 640, 640, 3), dtype=float32)
  • seq=tf.Tensor(shape=(32, 500), dtype=int64)
  • training=True
```
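As far as I can tell, the failure comes from the decoder unpacking exactly two dimensions from the token tensor, so a rank-3 (bsz, instances, seqlen) input breaks it. A minimal sketch of what I think happens, using `.shape.as_list()` as a stand-in for `get_shape` (an assumption on my part):

```python
import tensorflow as tf

tokens_2d = tf.zeros([32, 500], dtype=tf.int64)     # (bsz, seqlen): what the decoder expects
tokens_3d = tf.zeros([32, 3, 500], dtype=tf.int64)  # (bsz, instances, seqlen): what I passed

_, seqlen = tokens_2d.shape.as_list()               # fine: exactly two values to unpack
# _, seqlen = tokens_3d.shape.as_list()             # ValueError: too many values to unpack (expected 2)
```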

Do you have any idea what is going wrong? Have you tried using multiple sequences, and if so, how should it be done?

Thank you!!!

chentingpc commented 1 year ago

One workaround is to reshape the tensor from (bsz, instances, seqlen) into (bsz * instances, seqlen) for the model, and then reshape the output back afterwards. Hope this helps.
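A minimal sketch of that workaround, assuming a model that takes (images, seq) as in the traceback above; `model` and its call signature are hypothetical placeholders, not the exact pix2seq API:

```python
import tensorflow as tf

bsz, instances, seqlen = 32, 3, 500
images = tf.zeros([bsz, 640, 640, 3], dtype=tf.float32)
seqs = tf.zeros([bsz, instances, seqlen], dtype=tf.int64)    # (bsz, instances, seqlen)

# Fold the instance axis into the batch axis so the decoder sees rank-2 tokens.
flat_seqs = tf.reshape(seqs, [bsz * instances, seqlen])      # (bsz * instances, seqlen)

# Repeat each image `instances` times so image i lines up with its sequences.
flat_images = tf.repeat(images, repeats=instances, axis=0)   # (bsz * instances, 640, 640, 3)

# logits = model(flat_images, flat_seqs, training=True)      # hypothetical call
# logits = tf.reshape(logits, [bsz, instances, seqlen, -1])  # unfold the batch afterwards
```

Note that repeating the images this way encodes each image `instances` times; if that is too costly, the same fold/unfold trick could presumably be applied only to the decoder inputs, with the encoded image features tiled instead.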
