sayef / fsner

Few-shot Named Entity Recognition

Why do we need the same number of EN examples by support list? #1

Closed piegu closed 2 years ago

piegu commented 2 years ago

Hi,

I just deleted one EN example from your code (the "Horizontal flow wrapper..." example):

supports = [
    [
        # 'Horizontal flow wrapper [E] Pack 403 [/E] features the new retrofit-kit „paper-ON-form“',
        '[E] Paloma Pick-and-Place-Roboter [/E] arranges the bakery products for the downstream tray-forming equipment',
        'Finally, the new [E] Kliklok ACE [/E] carton former forms cartons and trays without the use of glue',
        'We set up our pilot plant with the right [E] FibreForm® [/E] configuration to make prototypes for your marketing tests and package validation',
        'The [E] CAR-T5 [/E] is a reliable, purely mechanically driven cartoning machine for versatile application fields'
    ],
    [
        "[E] Walmart [/E] is a leading e-commerce company",
        "I recently ordered a book from [E] Amazon [/E]",
        "I ordered this from [E] ShopClues [/E]",
        "Fridge can be ordered in [E] Amazon [/E]",
        "[E] Flipkart [/E] started it's journey from zero"
    ]
]

... and the call start_prob, end_prob = model(W_query, W_supports) then failed:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-5-c3bd7728c44c> in <module>()
      4 W_supports = tokenizer.tokenize([s for support in supports for s in support]).to(device)
      5 
----> 6 start_prob, end_prob = model(W_query, W_supports)
      7 output = tokenizer.extract_entity_from_scores(query, W_query, start_prob, end_prob, thresh=0.50)
      8 

1 frames
/usr/local/lib/python3.7/dist-packages/fsner-0.0.1-py3.7.egg/fsner/model.py in forward(self, W_query, W_supports)
     48 
     49         # reshape from (batch_size*n_exaples_per_entity, 384, 784) to (batch_size, n_exaples_per_entity, 384, 784)
---> 50         S = S.view(q.shape[0], -1, S.shape[1], S.shape[2])
     51 
     52         s_start = S[(W_supports["input_ids"] == 30522).view(S.shape[:3])].view(S.shape[0], -1, 1, S.shape[-1])

RuntimeError: shape '[2, -1, 384, 768]' is invalid for input of size 2654208

However, I don't understand why we have to provide the model with the same number of EN examples for each entity. Why is that required?
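
The numbers in the traceback point to a likely explanation. The sketch below is plain Python and assumes, as the error message suggests, that each support sentence is encoded to a 384 x 768 block and that the view call groups those blocks evenly per entity. The helper name can_group_supports is hypothetical and is not part of fsner.

```python
def can_group_supports(total_elements, n_entities, seq_len=384, hidden=768):
    """Check whether a tensor with `total_elements` values can be viewed as
    (n_entities, -1, seq_len, hidden), the shape requested in the traceback."""
    per_sentence = seq_len * hidden               # values in one encoded sentence
    n_supports = total_elements // per_sentence   # number of support sentences
    # A view with -1 only works when the known dimensions divide the total size.
    ok = total_elements % (n_entities * per_sentence) == 0
    return ok, n_supports

# Size taken from the error message: 9 support sentences for 2 entities.
print(can_group_supports(2654208, 2))           # (False, 9)
# The original 5 + 5 examples: 10 support sentences for 2 entities.
print(can_group_supports(10 * 384 * 768, 2))    # (True, 10)
```

With 4 + 5 examples there are 9 encoded sentences, which cannot be split evenly between 2 entities, so the reshape at model.py line 50 raises the RuntimeError.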

sayef commented 2 years ago

Resolved in this PR huggingface/transformers#13864
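
For an install that predates that change, one possible workaround is to pass the same number of examples for every entity. This is an assumption drawn from the error above, not something confirmed in this thread. A minimal sketch, where equalize_supports is a hypothetical helper and not part of fsner:

```python
def equalize_supports(supports):
    """Truncate every per-entity support list to the length of the shortest one."""
    n = min(len(group) for group in supports)
    return [group[:n] for group in supports]

# Placeholder sentences standing in for the 4 + 5 examples from the report.
supports = [
    ["a1", "a2", "a3", "a4"],
    ["b1", "b2", "b3", "b4", "b5"],
]
balanced = equalize_supports(supports)
print([len(group) for group in balanced])  # [4, 4]
```

Truncating discards examples from the longer lists. Adding examples to the shorter lists would be the alternative if every sentence should be kept.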

piegu commented 2 years ago

Hi @sayef.

I published a Colab notebook that uses your code from https://github.com/sayef/fsner, but it does not work.

Link to notebook: https://colab.research.google.com/drive/1GpZBAyfOVnIJ1usDjqTuDWpxj3LFb8wY?usp=sharing

sayef commented 2 years ago

Hi,

I have not released the latest changes to PyPI yet. Could you please clone the repo and install it using setup.py, as instructed in the README?

Let me know if that works. Thanks.

piegu commented 2 years ago

Yes, it works with a clone of transformers + fsner (I updated the test notebook at https://colab.research.google.com/drive/1GpZBAyfOVnIJ1usDjqTuDWpxj3LFb8wY?usp=sharing).

Thanks for your work @sayef