gmihaila / ml_things

This is where I put things I find useful that speed up my work with Machine Learning. Ever looked through your old projects to reuse those cool functions you created before? Well, this repo is designed to be a Python library of functions I created in my previous projects that can be reused. I also share some notebook tutorials and Python code snippets.
https://gmihaila.github.io
Apache License 2.0
245 stars, 61 forks

Longformer issue #5

Closed: shainaraza closed this issue 2 years ago

shainaraza commented 3 years ago

Hi @gmihaila, the fine-tuning notebook works very well, but when I use a sample size larger than 100 records, the Longformer notebook shows strange messages. [image]

This has nothing to do with the ml_things library's fix_text(): even if I don't use it, I still get an error. Upon reducing the sample size, it works fine. Any suggestions, please?

gmihaila commented 3 years ago

@shainaraza Make sure that your text data has the right format. From the error, it looks like a text value is a float and the code is trying to get the length of a float. I would try to find the exact sample where the error occurs and see what it looks like. Maybe it's in a weird format.

The fix_text function shouldn't cause any issues. It only fixes any weird characters that you might have in the text data.
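One way to track down such a sample is to scan the data for entries that are not strings. This is only a minimal sketch, assuming the samples sit in a plain Python list; the name `texts`, the helper `find_non_string_samples`, and the example values are made up for illustration and are not from the notebook.

```python
def find_non_string_samples(texts):
    """Return (index, value) pairs for every entry that is not a str."""
    return [(i, t) for i, t in enumerate(texts) if not isinstance(t, str)]


# Hypothetical data: missing values often show up as float NaN
# when text is loaded from a CSV file.
texts = ["a good movie", float("nan"), "a bad movie", 3.5]

bad_samples = find_non_string_samples(texts)
for index, value in bad_samples:
    # Show where each offending entry is and what type it has.
    print(f"sample {index}: {value!r} ({type(value).__name__})")
```

Calling `len()` on any of the reported entries would raise a `TypeError`, so those are the samples worth inspecting first.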

If you're ok with reducing the sample size I would recommend doing that if it fixes the error.

shainaraza commented 3 years ago

@gmihaila Thanks, you gave me super advice, really grateful. I was able to resolve it. The reason was that I am using more than text as the input, so texts.append() gave me a list of lists. I flattened it into one list and now it's working.

Great help from you on this notebook. Best
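For anyone hitting the same problem, the flattening fix described above could look roughly like this. It is a sketch under assumptions, not the code from the notebook: `nested_texts` and `flatten_texts` are hypothetical names, and the data is invented.

```python
def flatten_texts(nested):
    """Flatten one level of nesting while keeping plain strings intact."""
    flat = []
    for item in nested:
        if isinstance(item, str):
            # A string is itself iterable, so append it whole
            # instead of extending with its characters.
            flat.append(item)
        else:
            # A list (or tuple) of strings: add each element.
            flat.extend(item)
    return flat


# Hypothetical result of calling texts.append(list_of_fields) per record.
nested_texts = [["first title", "first body"], ["second title"], "third body"]

texts = flatten_texts(nested_texts)
print(texts)
```

If each record's fields should stay together as one sample, joining them with `" ".join(fields)` before appending would be an alternative to flattening.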