Moeinh77 / Virus-DNA-classification-BERT

Classification of 6 viruses, including COVID-19, based on their DNA sequences using Transformers

about MAX_LEN #4

Open cxl973 opened 1 year ago

cxl973 commented 1 year ago

Hi @Moeinh77, I noticed that the maximum token length for BERT is 512, while the sequences in the training data are much longer than 512. Does this mean that the sequences can simply be truncated to 512 when the data is processed?

Moeinh77 commented 1 year ago

Hi. Yes, the sequences will be truncated to 512.
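For readers who want to see what truncating to 512 amounts to, here is a minimal sketch in plain Python. It is not the repository's code: the overlapping k-mer tokenization, the `kmer_tokens` and `truncate_for_bert` helpers, and reserving two positions for `[CLS]` and `[SEP]` are all assumptions made for illustration.

```python
MAX_LEN = 512  # BERT's maximum input length in tokens, as discussed in this issue


def kmer_tokens(sequence, k=3):
    """Split a DNA string into overlapping k-mers (hypothetical tokenization)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def truncate_for_bert(tokens, max_len=MAX_LEN):
    """Keep only the leading tokens so [CLS] + tokens + [SEP] fits in max_len."""
    body = tokens[: max_len - 2]  # assumption: two slots go to special tokens
    return ["[CLS]"] + body + ["[SEP]"]


sequence = "ACGT" * 500            # toy sequence of 2000 bases
tokens = kmer_tokens(sequence)     # 1998 overlapping 3-mers
encoded = truncate_for_bert(tokens)

print(len(tokens), len(encoded))   # 1998 512
```

In this sketch everything past the cutoff is simply dropped, so the model only ever sees the beginning of each sequence.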

On Tue, May 16, 2023, 8:25 p.m. abc-123-ship-cell @.***> wrote:

Hi, I noticed that the maximum token length for BERT is 512, while the DNA sequences in the training set are much longer than 512. Does this mean that the sequences can simply be truncated to 512 when the data is processed?
