format of data - Githubissues

shainaraza commented 4 years ago

Can you please describe the data format, e.g., for CoLA? Is it ID, label, text? I believe it is: source of the sentence | acceptability judgment label | acceptability judgment 2 | original sentence.

Louis-udm commented 4 years ago

The dataset structure is simple. We only need the text and the label, and every line is one document. For CoLA, text=df[3] and label=df[1]. You can organize your own documents the same way.
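
As a rough illustration of the layout described above, the sketch below reads a CoLA-style tab-separated stream with no header row and takes column 3 as the text and column 1 as the label (the df[3] / df[1] mentioned above). It uses only the standard csv module rather than the repository's own loader; the function name read_cola_rows and the sample rows are made up for illustration.

```python
import csv
import io

def read_cola_rows(handle):
    """Return (texts, labels) from a CoLA-style TSV stream without a header.

    Assumed columns: 0 = source code, 1 = label, 2 = original annotation, 3 = sentence.
    """
    texts, labels = [], []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:  # skip blank or malformed lines
            continue
        labels.append(int(row[1]))
        texts.append(row[3])
    return texts, labels

# Made-up sample rows in the assumed layout; column 2 may be empty.
sample = "src1\t1\t\tThe cat sat on the mat.\nsrc1\t0\t*\tCat the mat on sat.\n"
texts, labels = read_cola_rows(io.StringIO(sample))
print(texts)
print(labels)
```

For a file on disk, the same function should work with a handle opened as open(path, newline="", encoding="utf-8").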

ghost commented 4 years ago

Given that we have a multi-label dataset, can you please specify how to structure the input dataset?

Louis-udm commented 4 years ago

For multi-label, you can just refer to MNIST and use labels like 0, 1, 2, 3, 4, ...

ghost commented 4 years ago

But this is multi-class. I mean multi-label, where each text is assigned one or more labels. For example:

labels     text
(0,2,4)    hello world
(0,1)      this is another example
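
One common way to represent label sets like these is a multi-hot vector per text. The sketch below is only an illustration of that idea, not the repository's format; the helper name to_multi_hot and the choice of 5 labels are assumptions.

```python
def to_multi_hot(label_sets, num_labels):
    """Turn e.g. (0, 2, 4) into [1, 0, 1, 0, 1] for num_labels = 5."""
    vectors = []
    for labels in label_sets:
        vec = [0] * num_labels
        for idx in labels:
            if not 0 <= idx < num_labels:
                raise ValueError(f"label {idx} outside 0..{num_labels - 1}")
            vec[idx] = 1
        vectors.append(vec)
    return vectors

# The two examples from the comment above, assuming 5 possible labels.
examples = [((0, 2, 4), "hello world"), ((0, 1), "this is another example")]
targets = to_multi_hot([labels for labels, _ in examples], num_labels=5)
print(targets)  # [[1, 0, 1, 0, 1], [1, 1, 0, 0, 0]]
```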

Louis-udm commented 4 years ago

Got it! Then you'll need to slightly modify the prepare_ file code, as well as the loss function. You can refer to other multi-label Git repos; I think the modification is easy once you understand the principle. I believe the model stays the same, and building the graph does not need the labels.
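
To make the loss-function point concrete: a usual choice for multi-label targets is a per-label binary cross-entropy on the logits (in PyTorch, torch.nn.BCEWithLogitsLoss), in place of a single softmax cross-entropy over classes. The pure-Python sketch below only illustrates the arithmetic for one example; it is not code from this repository, and the function names and sample logits are made up.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)).
    """
    total = 0.0
    for x, y in zip(logits, targets):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

def predict_labels(logits, threshold=0.5):
    """Select every label whose sigmoid probability exceeds the threshold."""
    cutoff = math.log(threshold / (1.0 - threshold))  # sigmoid(x) > t  <=>  x > logit(t)
    return [i for i, x in enumerate(logits) if x > cutoff]

logits = [2.0, -1.0, 0.5, -3.0, 1.5]   # one made-up score per label
targets = [1, 0, 1, 0, 1]              # multi-hot target for labels (0, 2, 4)
print(bce_with_logits(logits, targets))
print(predict_labels(logits))          # [0, 2, 4]
```

Each label is scored independently, so a text can receive any number of labels at prediction time; the 0.5 threshold is a default that may need tuning.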

shainaraza commented 4 years ago

In your program, column 3 is empty (left blank). If I fill it in, would it still work? I also have the same question as above about multi-label data.

Column 1: the code representing the source of the sentence.
Column 2: the acceptability judgment label (0=unacceptable, 1=acceptable).
Column 3: the acceptability judgment as originally notated by the author.
Column 4: the sentence.

shainaraza commented 4 years ago

I made a change to the code in prepare_data. Is that enough? I can still get results with this change:

elif cfg_ds == 'mydata':
    # label string -> class index, and the exact inverse mapping
    label2idx = {'0': 0, '1': 1, '2': 2}
    idx2label = {0: '0', 1: '1', 2: '2'}
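
Hand-written label maps can easily drift out of sync, so one option is to derive both from the label column and check the round trip. This is a minimal sketch under that assumption; build_label_maps is a hypothetical helper, not a function in prepare_data.

```python
def build_label_maps(labels):
    """Map each distinct label string to an index and build the exact inverse."""
    label2idx = {label: i for i, label in enumerate(sorted(set(labels)))}
    idx2label = {i: label for label, i in label2idx.items()}
    return label2idx, idx2label

label2idx, idx2label = build_label_maps(["0", "1", "2", "1", "0"])
# Round-trip check: every index must map back to the label it came from.
assert all(idx2label[label2idx[label]] == label for label in label2idx)
print(label2idx)  # {'0': 0, '1': 1, '2': 2}
print(idx2label)  # {0: '0', 1: '1', 2: '2'}
```

Note that sorting label strings is lexicographic ("10" sorts before "2"), which only affects the index order, not the consistency of the two maps.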

ray184550061 commented 4 years ago

Could you tell me whether it is enough to modify these?

shainaraza commented 4 years ago

I think more changes are required in the score calculation. After I made these changes, the final scores got worse, which suggests a change is also needed where the logits are calculated. Any thoughts?

LeohAYU commented 4 years ago

I only adjusted this and ran a 5-class classification. The accuracy is only 20%. May I ask where I need to modify it? Thank you very much!
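
For context, with 5 roughly balanced classes, 20% is what random guessing would give, so it may be worth comparing the accuracy against simple baselines and looking at how the predictions are distributed. The sketch below uses made-up labels and predictions, and the helper names are assumptions.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def baselines(y_true, num_classes):
    """Return (uniform-chance accuracy, majority-class accuracy)."""
    majority_count = Counter(y_true).most_common(1)[0][1]
    return 1.0 / num_classes, majority_count / len(y_true)

y_true = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]  # made-up balanced 5-class labels
y_pred = [0] * len(y_true)               # a model that collapsed onto one class
print(accuracy(y_true, y_pred))          # 0.2
print(baselines(y_true, 5))              # (0.2, 0.2)
print(Counter(y_pred))                   # Counter({0: 10})
```

If nearly all predictions fall on one class, or accuracy sits at the chance level, the label mapping and the number of output classes are reasonable first things to check.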

shainaraza commented 4 years ago

Yes, the same happened to me: the score got worse. I didn't try multi-class afterwards, but I believe the change should be made inside https://github.com/Louis-udm/VGCN-BERT/blob/master/model_vgcn_bert.py, in the score calculation. If you do so, let me know. I am also wondering how one could swap BERT for some other model. Any thoughts?

LeohAYU commented 4 years ago

I am also doing related research at the moment, so we can keep in touch.

Louis-udm / VGCN-BERT

format of data #7