Can you share config.json file for BERT?

ninjakx commented 3 years ago

I am having trouble training the model, specifically with the config.json part.

omarsou commented 3 years ago

Hello, I noticed that you closed this issue. Does that mean you have solved it?

ninjakx commented 3 years ago

Actually, no. I am stuck on the LayoutLM one right now. Can you help? I resolved it just yesterday, but it has started appearing again. Same error: https://github.com/huggingface/transformers/issues/5611

Can you help with the config part (LayoutLM and BERT-large)?

I also downloaded https://www.kaggle.com/jpmiller/layoutlm?select=layoutlm-large-uncased and tried model_path = "layoutlm-large-uncased".

model_path = 'bert-large-uncased'
num_labels = len(labels)
config_class, model_class, tokenizer_class = LayoutlmConfig, LayoutlmForTokenClassification, BertTokenizerFast
config = config_class.from_pretrained(model_path, num_labels=num_labels+1)
tokenizer = tokenizer_class.from_pretrained(model_path, do_lower_case=True)
model = model_class.from_pretrained(model_path, from_tf=bool(".ckpt" in model_path), config=config)
model = model.to(device)

max_seq_length = 150
pad_token_label_id = CrossEntropyLoss().ignore_index
train_dataset = CordDataset(train, tokenizer, labels, pad_token_label_id)
validation_dataset = CordDataset(val, tokenizer, labels, pad_token_label_id)
model_type = 'layoutlm'

This is what I am loading. Sorry for asking such a basic question; I have just started with NLP.

Here is the code: https://www.kaggle.com/ninjakx01/notebook636ede94ea

I am not able to run your code. Can you help? I am getting many errors. A few elements give me errors like:

CUDA error: device-side assert triggered

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP) (this one can be solved by upgrading PyTorch)

IndexError: index out of range in self

KeyError: 24 in pred_list[i].append(label......) of result function

A few of the data samples, such as the 79th element of the train loader, produce these errors.

omarsou commented 3 years ago

Hello. For errors involving the GPU (CUDA), can you run the code on the CPU? The CPU usually gives more explicit error messages.

Concerning the model you are using, I took the pretrained model from the official repository: https://github.com/microsoft/unilm/tree/master/layoutlm. You can find the links to the models (OneDrive link and Google Drive link) in the "Pre-trained Model" section of their "readme.txt". Can you try to use this one? It contains everything you need (the config JSON file, the tokenizer file, the pretrained model, and so on). The direct Google Drive link is https://drive.google.com/drive/folders/1tatUuWVuNUxsP02smZCbB5NspyGo7g2g. It's my fault; I should have specified where to find the files needed to run the notebook.
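Before loading the downloaded folder, it can help to confirm that the expected files are actually in it. The snippet below is only a minimal sketch: the helper name `missing_model_files` is hypothetical, and the file names are an assumption based on the names BERT-style checkpoints conventionally used in older transformers releases, not a list confirmed in this thread.

```python
from pathlib import Path

# Assumed file names (not confirmed by this thread): the conventional
# names for a BERT-style PyTorch checkpoint and its tokenizer vocabulary.
EXPECTED_FILES = ("config.json", "pytorch_model.bin", "vocab.txt")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_model_files("layoutlm-large-uncased")` returns an empty list when all three files are present, and otherwise lists the ones to look for in the downloaded archive.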

ninjakx commented 3 years ago

Let me try. I tried uploading the same layoutLM-large-uncased via Kaggle but still got an error, this time KeyError: 24 in pred_list[i].append(label......) of the result function.

ninjakx commented 3 years ago

I am still getting the same error: IndexError: index out of range in self

Do you have the working directory of the code on Google Drive? If you can share a working demo, it will help. I don't know why I am getting these errors.

omarsou commented 3 years ago

I tried to rerun the notebook and I get the same errors. First of all, try using the old version of transformers (i.e. !pip install transformers==2.9.0). Secondly, try modifying the code as below:

model_path = 'bert-large-uncased'
num_labels = len(labels)
config_class, model_class, tokenizer_class = LayoutlmConfig, LayoutlmForTokenClassification, BertTokenizerFast
config = config_class.from_pretrained(model_path, num_labels=num_labels)  # remove the "+1"
tokenizer = tokenizer_class.from_pretrained(model_path, do_lower_case=True)
model = model_class.from_pretrained(model_path, from_tf=bool(".ckpt" in model_path), config=config)
model = model.to(device)

Sorry, it's my fault: I didn't push the latest version of the notebook.

Please try these modifications. Thank you.
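One plausible reading of the KeyError: 24 reported earlier is that a classifier head built with num_labels + 1 outputs can predict a class index that the index-to-label map does not contain. The sketch below illustrates that mismatch only; the label names, the count of 24, and both helper functions are hypothetical, and the notebook's actual result function may differ.

```python
def build_label_map(labels):
    """Map each class index to its label name: {0: labels[0], 1: labels[1], ...}."""
    return {i: label for i, label in enumerate(labels)}


def unmapped_predictions(pred_ids, label_map):
    """Return the predicted class ids that have no entry in label_map."""
    return sorted({p for p in pred_ids if p not in label_map})


# Hypothetical label set with 24 entries, so valid indices are 0..23.
labels = [f"LABEL_{i}" for i in range(24)]
label_map = build_label_map(labels)

# A head with len(labels) + 1 outputs can emit index 24, which is not a key
# of label_map, so label_map[24] would raise KeyError: 24.
print(unmapped_predictions([0, 5, 24], label_map))  # [24]
```

With num_labels = len(labels), every index the model can emit has a matching key, which is consistent with removing the "+1".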

ninjakx commented 3 years ago

Still getting the same error:

-> 1852     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   1853 
   1854 

IndexError: index out of range in self

:'(

https://colab.research.google.com/drive/10am_HcutuiqMZ5xsAO100EWIvdCO5h2K?usp=sharing

Is that working for you?

omarsou commented 3 years ago

Yes, I can see the error.

As you can see in the image above, some coordinates in the dataset are negative, so when you try to compute the embedding you get the error "Index out of range in self". The thing is that the index where these negative coordinates appear is not the same in your dataset as in mine. (For me it is index 669, and for you it is 770 in train.) You should then modify two cells as shown in the image below:
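Because the offending index differs between copies of the dataset, it may be easier to search for it than to hard-code it. This is a minimal sketch, not the change made in the notebook: it assumes each sample is a list of [x0, y0, x1, y1] boxes, that valid coordinates lie in 0..1000 (the usual LayoutLM normalization), and the names `find_bad_boxes` and `clamp_boxes` are hypothetical.

```python
def find_bad_boxes(samples, low=0, high=1000):
    """Return (sample_index, box_index, box) for each box with a coordinate outside [low, high]."""
    bad = []
    for s_idx, boxes in enumerate(samples):
        for b_idx, box in enumerate(boxes):
            if any(c < low or c > high for c in box):
                bad.append((s_idx, b_idx, tuple(box)))
    return bad


def clamp_boxes(boxes, low=0, high=1000):
    """Return a copy of boxes with every coordinate clipped into [low, high]."""
    return [[min(max(c, low), high) for c in box] for box in boxes]
```

Running `find_bad_boxes` over the training samples reports which sample indices contain negative coordinates. Clamping is one possible remedy and skipping those samples is another; which one fits depends on how the annotations were produced.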

ninjakx commented 3 years ago

Thanks, I didn't think the index could be different. That was exactly the problem. Thanks for your time :)

omarsou / layoutlm_CORD

Can you share config.json file for BERT? #1