cooper12121 / DIE-EC

something about src/preprocess_edu_embed.py #3

Open eeehco opened 1 month ago

eeehco commented 1 month ago

I encountered the following error while running the code with 'python src/preprocess_edu_embed.py':

image

After investigating, I found that the following line of code causes edu_list to be None. Can you tell me what is going on and how I can solve it?

image

era211 commented 1 month ago

I get the same error when running "python src/preprocess_edu_embed.py".

eeehco commented 1 month ago

Do you know how to solve it?

era211 commented 1 month ago

I haven't solved it yet. Can you let me know if you do?

cooper12121 commented 1 month ago

For the edu_list, you first need to construct the EDUs with an RST model, and also the lexical chain nodes.

eeehco commented 1 month ago

Thanks for your reply. Can you tell me specifically how to construct the EDUs?

era211 commented 1 month ago

Thanks for your reply. Can you tell me specifically how to construct the EDUs? Thank you very much.

cooper12121 commented 1 month ago

You can find the method I used to construct the RST in the paper. Please note that there are several different ways to construct it. For better performance, you might want to try the latest models, which can be found in the references. The code needed to construct the RST can also be found in the "sota_end2end_parser_1" directory. Hope this is helpful.

era211 commented 1 month ago

Thank you for your reply. Can I directly follow the "README.md" file in the "sota_end2end_parser_1" directory? image

cooper12121 commented 1 month ago

Of course. They have also released a new RST model; you can try that.

eeehco commented 1 month ago

I have tried. When running 'pipeline.py', I encountered the following error. Can you tell me what EN_200.model is, or where I can find it?

image

cooper12121 commented 1 month ago

Actually, you can find this in the reference paper for the RST parser.

eeehco commented 1 month ago

Thank you! I found it here: https://github.com/NLP-Discourse-SoochowU/sota_end2end_parser

eeehco commented 1 month ago

I ran pipeline.py according to the "readme" in "sota_end2end_parser_1", but when I run "preprocess_edu_embed.py" again, I still get the same error as before.

era211 commented 1 month ago

Hello, have you solved this problem?

eeehco commented 1 month ago

No. After running 'pipeline.py' and then running 'preprocess_edu_embed.py', the error is still the same as before.

era211 commented 1 month ago

To keep going, do we need to run the sentences in the dataset through 'pipeline.py' and then add the generated EDUs back into the dataset ourselves? Also, do you remember how much GPU memory you used to run 'pipeline.py'? Would you mind communicating with me by email?

era211 commented 1 month ago

Hello author, when running "src/preprocess_edu_embed.py" we ran into the problem that edu_list is None, and we currently cannot solve it. Would you be willing to update "readme.md" with the steps for generating and using the EDUs? We would appreciate it if you could.

eeehco commented 1 month ago

ok,my email address is:2819233822@qq.com

cooper12121 commented 1 month ago

The reason for this error is clear: you are getting a None edu_list, and it comes from the input file. Make sure the code that constructs the RST gives you a non-None edu_list.
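A quick way to check this before running the code in src is to scan the input file for records whose edu_list is missing or empty. The sketch below is only an illustration: it assumes the input is a JSON array of objects and that the field is literally named "edu_list", as in this discussion. The helper names find_missing_edu and check_file and the toy records are made up here.

```python
import json
from typing import Any


def find_missing_edu(records: list[dict[str, Any]], key: str = "edu_list") -> list[int]:
    """Return the indices of records whose `key` is absent, None, or empty."""
    return [i for i, rec in enumerate(records) if not rec.get(key)]


def check_file(path: str) -> list[int]:
    # Assumption: the file holds a JSON array of mention records.
    with open(path, encoding="utf-8") as f:
        return find_missing_edu(json.load(f))


# Toy records standing in for the real input file.
sample = [
    {"mention_id": "a", "edu_list": ["He resigned", "after the vote ."]},
    {"mention_id": "b", "edu_list": None},
    {"mention_id": "c"},
]
missing = find_missing_edu(sample)
print(f"{len(missing)} of {len(sample)} records lack edu_list: {missing}")
```

If the list of indices is not empty, the RST step did not populate every record, and the later preprocessing will hit the same None value.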

eeehco commented 1 month ago

How can I make sure I get a non-None edu_list from the code that constructs the RST?

cooper12121 commented 1 month ago

These errors must come from the process of constructing the RST. Read the code repository of the reference paper carefully and check whether your resulting edu_list is valid.

eeehco commented 1 month ago

Are you referring to this "edu.txt" file?

image

era211 commented 1 month ago

Hello author, building the RST generates the "edu.txt" file. Should we add the content of this file back into the JSON files of the dataset, so that it is loaded into edu_list when we build the MentionData class?

cooper12121 commented 1 month ago

No, not "edu.txt". Read through the code structure carefully first; you will find the related code in the "RST_parse" directory.

cooper12121 commented 1 month ago

Yes, you should get the edu_list first and then add it to the input file. After that, you can process the file with the code in the src directory.

era211 commented 1 month ago

Hello, when running "pipeline.py", "do_seg()" produces the EDUs of the sentences. From "sota_end2end_parser", do we only need the EDUs? Will the file generated by "do_parse()" be used in later steps? If not, I will skip "do_parse()", because it fails with: TypeError: start() takes 1 positional argument but 2 were given. I don't know how to fix this.

era211 commented 1 month ago

Excuse me, I have another question. To generate the EDUs for the "Train_Event_gold_mentions.json" file, do we need to extract all the sentences from the JSON file into "raw.txt", run "pipeline.py" to get "edu.txt", and then rebuild the EDUs of each sentence from "edu.txt" and add them as an "edu_list" field to "Train_Event_gold_mentions.json"? Is the process I described clear? If that is the process, it is a lot of work. Is there a better way to do it?

cooper12121 commented 1 month ago

You don't need to run pipeline.py. As I mentioned before, you need to focus on the "RST_parse" directory. If you read the code carefully, you will find that wec_process.py contains a clear process for handling the data.

image image

Then you add a key-value pair to the original WEC dataset. After that, you can just run the src/....py code.
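The last step described here, adding one key-value pair per record, could look roughly like the following. This is not the code in wec_process.py. It is a minimal sketch that assumes each record is a dict with a token list under "mention_context" (an assumption about the WEC format), and it uses a hypothetical placeholder segmenter, split_on_commas, where the real RST segmenter would go.

```python
import json
from typing import Callable

Segmenter = Callable[[list[str]], list[str]]


def split_on_commas(tokens: list[str]) -> list[str]:
    """Placeholder segmenter (NOT an RST model): cut the token list at commas."""
    edus, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok == ",":
            edus.append(" ".join(current))
            current = []
    if current:
        edus.append(" ".join(current))
    return edus


def add_edu_lists(records: list[dict], segment: Segmenter,
                  context_key: str = "mention_context") -> list[dict]:
    """Return copies of the records with an added "edu_list" key."""
    out = []
    for rec in records:
        edus = segment(rec[context_key])
        if not edus:
            # Fail early instead of writing a record that breaks later steps.
            raise ValueError(f"empty edu_list for record {rec.get('mention_id')}")
        out.append({**rec, "edu_list": edus})
    return out


# Toy record; the field names other than "edu_list" are assumptions.
records = [{"mention_id": "m1",
            "mention_context": ["After", "the", "vote", ",", "he", "resigned", "."]}]
updated = add_edu_lists(records, split_on_commas)
print(json.dumps(updated[0]["edu_list"]))
```

Passing the segmenter in as a function keeps the merge step independent of whichever RST model is used, and raising on an empty result surfaces a bad segmentation before the src scripts run.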

cooper12121 commented 1 month ago

For this kind of error, you should at least debug where it comes from; then I can help you with it.
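To find where an error like the TypeError above comes from, printing the full traceback is usually enough to locate the call site. The snippet below is a generic illustration, not a diagnosis of the parser code: the class Worker and the function run are hypothetical, and the snippet simply reproduces the same kind of message by passing an extra argument to a method that accepts only self.

```python
import traceback


class Worker:
    """Hypothetical class; start() accepts no argument besides self."""

    def start(self):
        return "started"


def run(worker, payload):
    # Bug on purpose: start() is called with one argument too many.
    return worker.start(payload)


try:
    run(Worker(), "data")
except TypeError as exc:
    message = str(exc)
    # The traceback lists every frame, ending at the offending call.
    trace = traceback.format_exc()
    print(trace)
```

The last frame in the printed traceback names the function that made the bad call, which is the detail worth including in a bug report.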

cooper12121 commented 1 month ago

Please note that I have been very busy recently. If you run into code errors, please debug them first and give me more details; I will work with you to fix them.

era211 commented 1 month ago

I solved this problem (the TypeError in "do_parse()"), thank you.

era211 commented 1 month ago

Thanks for your reply. We will try running wec_process.py. Have a nice day!

era211 commented 1 month ago

Thank you very much. I will keep trying. Good luck with your work, and I hope you have a great day!

era211 commented 3 weeks ago

Hello author, sorry to bother you. In the training output, we found that the accuracy stayed unchanged at 0.9933043122 over the first several epochs, while precision, recall, and F1 were all 0. We looked at the function that returns these values and printed the confusion matrix: in the first few epochs the TP count is always 0, which is why precision, recall, and F1 are 0. Is this expected, or did we set something up incorrectly when running the code?

image

cooper12121 commented 1 week ago

I'm sorry for the delayed response; I've just returned from attending a conference abroad. I'll work with you to fix all the possible errors from now on.

cooper12121 commented 1 week ago

This must be due to an issue with the model's output: every prediction is the negative class. To narrow it down, first try the hyperparameters I used and see whether the problem still exists. I did not see this problem in my own experiments. Given the distribution of the WEC dataset, where the ratio of positive to negative samples is 1:10, the model should not always predict the negative class.
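The reported symptom (constant accuracy with precision, recall, and F1 all 0) is what the standard formulas give when a model never predicts the positive class. The sketch below shows this with made-up counts at a 1:10 positive-to-negative ratio. The function name and the counts are illustrative and are not taken from the repository.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    # Guard each ratio so an empty denominator yields 0.0 instead of an error.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"acc": (tp + tn) / total, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up example: 100 positives, 1000 negatives, model predicts all negative.
all_negative = binary_metrics(tp=0, fp=0, fn=100, tn=1000)
print(all_negative)
```

With a 1:10 ratio, an all-negative model scores 10/11, about 0.909, in accuracy. By the same arithmetic, if the model predicts only the negative class, a constant accuracy of 0.9933 would mean that roughly 0.67% of the evaluated pairs are positive, so it may also be worth checking the class balance of the evaluation data.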