farinamhz opened 1 year ago
Hi, I've taken a look at the repo and paper. I'm a little confused; am I supposed to try to train/test the model on the augmented datasets used with LADy? I assumed our repo would have datasets with the original reviews + backtranslated reviews to run the model with, but I wasn't able to find where that would be, or if I'm approaching this correctly at all.
Hey @arfaamr, we'd like you to first check out the repo and run it on their datasets to figure out how it works. Since the datasets are similar (they are SemEval, but may be in different formats), it will be easier to integrate their work with LADy. So, once you can successfully run their codebase, we will integrate it with LADy.
We can talk more in our meeting today if it still needs clarification.
[Last week's progress]
Thanks @arfaamr. Please update us on your run on the toy example, then connect with @3ripleM to integrate it into the LADy pipeline. Thanks.
@hosseinfani: I tried using the CPU with the full original dataset, and it trains without error, but it is slow. I set epochs=2 instead of 100 just to see if it would train/test to completion, and it did. Should I still make a toy dataset and train with the default 100 epochs, or is this enough to confirm that the code works and move on to integrating with LADy?
@arfaamr
Awesome, yes, that's enough. Please post a snapshot of your run here for the record.
You can go ahead with integrating it into LADy. Please connect with Medi to create a new class definition, etc.
@3ripleM please connect with Arfaa for this.
Run on CPU for 2 epochs
Hi @arfaamr,
For the integration, we have a file named `mdl.py`, which is an abstract class for creating a new baseline.
To add the new baseline to LADy, you need to add a file to the `aml` directory, and its name should match the name of the baseline. Additionally, in `main.py`, you should add the model name to line 204.
Furthermore, there are more definitions to understand in the abstract class. Please investigate the class model, and if there are any parts you don't understand, feel free to contact me.
We can also schedule a Teams meeting. I believe it would be better to have a meeting once you've had a chance to review the LADy project in order to discuss your tasks.
Thanks @3ripleM.
I have created `emc.py` and added the model to `main.py`. So far, I have only created a basic `init()` with `naspects` and `nwords`. I'm not sure how to integrate the rest of the needed functions. I would be available to meet anytime today or tomorrow after noon.
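For reference, a rough skeleton of the direction I have in mind (the base-class name and the method names/signatures below are my guesses, not LADy's actual `mdl.py` API):

```python
# Rough skeleton for aml/emc.py; the base-class name and hook names are guesses
# taken from the discussion above, not LADy's actual abstract class.
from .mdl import AbstractAspectModel  # hypothetical name for the abstract class in mdl.py


class Emc(AbstractAspectModel):
    def __init__(self, naspects, nwords):
        super().__init__(naspects, nwords)

    def train(self, reviews_train, reviews_valid, settings, output):
        # Convert LADy Review objects into EMCGCN's input format, then run its trainer.
        raise NotImplementedError

    def infer(self, review):
        # Run the trained EMCGCN model on one review and return aspect/sentiment predictions.
        raise NotImplementedError
```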
This week, I started writing functions in `emc.py`, looking at other files in `aml` for reference.
Most of the other files seem to import a library for the baseline, e.g., `import fasttext` or `bert_e2e_absa`, to use functions like `train`, etc., so I tried that. I created an `__init__.py` in my baseline's repo, but I am not sure exactly how to import it in `emc.py`, as the repo name has a dash, which is invalid in import statements.
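One way around the dash might be something like this (a sketch; the directory name `EMCGCN-ASTE` and its placement next to `emc.py` are assumptions about how the repo would be vendored, not the actual layout):

```python
# Two possible ways to load code from a directory whose name contains a dash.
import sys
import importlib
from pathlib import Path

here = Path(__file__).parent         # assumed: the directory containing emc.py
repo_dir = here / 'EMCGCN-ASTE'      # assumed location of the cloned repo

# Option 1: put the dashed directory itself on sys.path and import its inner modules
# directly (the dash only appears in the folder name, not in the module names).
sys.path.insert(0, str(repo_dir))
# e.g. `import utils` / `import model`, whatever top-level modules the repo has

# Option 2: with `here` on sys.path, import the dashed directory as a package via importlib
# (importlib.import_module accepts names that are not valid Python identifiers).
sys.path.insert(0, str(here))
emcgcn = importlib.import_module('EMCGCN-ASTE')  # requires an __init__.py inside the repo
```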
Additionally, I noticed that `settings['train']` in `params.py` seems to have sections for other baselines. Should I add my baseline's args in there as well?
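If so, I imagine it would be something like the following (the key name `'emc'` and the hyperparameter values are placeholders guessed from the EMC code, not LADy's actual `params.py`):

```python
# Hypothetical new section mirroring the existing per-baseline entries in settings['train'];
# the key 'emc' and the values below are placeholders, not LADy's actual parameters.
settings = {'train': {}}            # this dict already exists in params.py
settings['train']['emc'] = {
    'epochs': 100,                  # the default number of epochs in the EMCGCN code
    'batch_size': 16,
    'lr': 1e-3,
    'pretrained_bert': 'bert-base-uncased',
}
```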
Also, I am currently working on this on a local branch of LADy. Should I push it so someone can check my progress and see if it looks okay so far, or just wait until I finish the functions?
This week, I worked on `preprocess()` in `emc.py`, which converts LADy's data to the format used in the EMCGCN code for training, etc.
I found that LADy's raw data is stored as XML files, while EMC's raw data is stored in JSON and `.vocab` files. The JSON files seem equivalent to LADy's XML files, but I am not sure what the `.vocab` files are for. They have names such as `postag` and `deprel` and are loaded with `pickle.load`, but I was not able to open or print them to see what exactly they contain.
As I understand it, LADy's XML data is used to create Review objects, which are used for training, etc. EMC's JSON and `.vocab` data is used to create Instance objects, which are similar to Review objects. I am working on converting Review objects to Instance objects, so that when LADy is run, its XML data will be used to create Reviews, those Reviews can be used to create Instances, and the Instances will be used in `emc.py`'s `train()`, etc.
Let me know if this is an incorrect interpretation of how LADy works or of how I should approach this.
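Concretely, the conversion I have in mind is roughly this (the Review attributes read here and the exact EMC field names are assumptions based on what I've seen so far, not verified against either codebase):

```python
import json

def review_to_emc_record(review):
    # Map one LADy Review to an EMC-style JSON record. The attributes read from `review`
    # and the field names below are assumptions, not verified APIs.
    tokens = review.sentences[0]        # assumed: tokenized text of the review
    return {
        'id': str(review.id),
        'sentence': ' '.join(tokens),
        'postag': [],                   # to be filled in by a POS tagger
        'head': [],                     # to be filled in by a dependency parser
        'deprel': [],                   # to be filled in by a dependency parser
        'triples': [],                  # target_tags / opinion_tags / sentiment per aspect
    }

def write_emc_json(reviews, path):
    # Dump the converted reviews into a single JSON file, like EMC's train/dev/test files.
    with open(path, 'w') as f:
        json.dump([review_to_emc_record(r) for r in reviews], f, indent=2)
```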
I have almost finished writing `wrapper.py`, which uses the attributes of Review objects to write a JSON file. I have not tested it yet; I am only unsure about the meaning of a few keywords that don't directly correspond to anything in LADy, like "postag", "deprel", and "target_tags".
Hi @arfaamr, thank you. Are they variables of the code? Could you give me an example of the code in which they have been used?
@farinamhz: I think I figured out what "target_tags" means now, but I'm still unsure about the others.
For "postag", "deprel", etc, they are written in the input JSON files like this, for each review:
{ ... "postag": ["CC", "DT", "NN", "VBD", "RB", "JJ", "IN", "PRP", "."], ... , "deprel": ["cc", "det", "nsubj", "cop", "advmod", "root", "case", "obl", "punct"]}
And they are also loaded in from the `.vocab` files and used in `train()`, like this:

```python
l_rpd = 0.01 * F.cross_entropy(post_pred.reshape([-1, post_pred.shape[3]]), tags_symmetry_flatten, ignore_index=-1)
l_dep = 0.01 * F.cross_entropy(deprel_pred.reshape([-1, deprel_pred.shape[3]]), tags_symmetry_flatten, ignore_index=-1)
l_psc = 0.01 * F.cross_entropy(postag.reshape([-1, postag.shape[3]]), tags_symmetry_flatten, ignore_index=-1)
l_tbd = 0.01 * F.cross_entropy(synpost.reshape([-1, synpost.shape[3]]), tags_symmetry_flatten, ignore_index=-1)
```
I did not see them mentioned in the paper, and I couldn't find anything about them online.
@arfaamr,
I haven't checked out the code, but based on what you provided:
In my opinion, "postag" appears to represent part-of-speech (POS) tags.
"deprel" seems to be related to parsing; it appears to be a linguistic annotation for dependency relations. In dependency grammar, these labels describe the relationships between words in a sentence.
In a meeting with Farinam, we found that EMC has a function that may be able to generate postag, deprel, etc. However, I'm not sure how to use it; it almost looks as though it takes part of the JSON as input to generate something else instead.
I tried using the nltk library to generate these fields instead, and I am able to generate the postags, but head and deprel seem more complicated. It might be better to try EMC's function first, but I need help with that.
There also seem to be other libraries besides nltk that can do this, but I'm not sure of their reliability.
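For the postags, what I tried with nltk is roughly this (a minimal sketch; the sentence is just an example):

```python
# Minimal sketch of generating "postag" with NLTK; the sentence is just an example.
import nltk

nltk.download('punkt')                        # tokenizer models
nltk.download('averaged_perceptron_tagger')   # POS tagger model

tokens = nltk.word_tokenize("The food was really great, but the service was slow.")
postag = [tag for _, tag in nltk.pos_tag(tokens)]  # Penn Treebank tags, e.g. ['DT', 'NN', 'VBD', ...]
print(postag)
```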
Hi @farinamhz:
I have completed as much of the wrapper as I can. I put `wrapper.py`, `splits.json`, and `review.pkl` in LADy's `src` directory, and it was able to output `.json` files to `output/`.
Currently, the problems I am still facing are:

- `get_aos()` in `review.py` does not seem to give any opinions. All opinions are blank, so the `opinion_tags` in the JSONs are incorrect as a result.
- nltk's `pos_tag()` has a parameter for language, but it only supports English and Russian. Is this sufficient, or do we need to find an alternative?
- Using `stanza` for `head` and `deprel` results in an error:

  ```
  [W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
  ```

  It may work on someone else's device instead, or we could try a different library.

Without using `stanza`, the rest of the code runs without error and produces JSON files like EMC's.
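For reference, what I was attempting with stanza looks roughly like this (a sketch; the sentence is just an example):

```python
# Minimal sketch of generating "head" and "deprel" with stanza; the sentence is just an example.
import stanza

stanza.download('en')  # downloads the English models on first run
nlp = stanza.Pipeline('en', processors='tokenize,pos,lemma,depparse')

doc = nlp("The food was really great, but the service was slow.")
for sent in doc.sentences:
    head = [w.head for w in sent.words]       # index of each word's syntactic head (0 = root)
    deprel = [w.deprel for w in sent.words]   # Universal Dependencies relation labels
    print(head, deprel)
```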
I have a local branch of LADy that I was working on, but I think I don't have permission to push it. I pushed my `wrapper.py` to a branch of my EMC fork instead for you to see it, here. It probably still has more errors, but it is as close as I can get.
I have upcoming finals this week, so I probably won't be able to work on it more. Sorry for the inconvenience.
Hey @arfaamr, Thank you for the updates. We'll take care of the rest. Good luck with the exams!
@hosseinfani
Meanwhile, since the reviews are all in English, whether original or backtranslated, I don't think NLTK would have any problem, @arfaamr.
@farinamhz, OK, that works out then. The original `wrapper.py` that you gave me had a `lang` parameter in the `preprocess()` function, so I wasn't sure whether that meant the function was expected to work with multiple languages.
Hey @arfaamr, I hope you are doing well. As we want to add a new baseline to LADy that is suitable for both aspect term and sentiment extraction, I have chosen the "Enhanced Multi-Channel Graph Convolutional Network for Aspect Sentiment Triplet Extraction" paper, accepted at ACL 2022. You can check it out and run it using their repo at this link: https://github.com/CCChenhao997/EMCGCN-ASTE
Let me know if you have any questions or need help on this task.
@hosseinfani