yuelinan / DARE


Can you please tell how to prepare the environment? #1

Open skybert623 opened 1 year ago

skybert623 commented 1 year ago

I ran into trouble when running your code. It seems that the dataset and the word embeddings are not provided. Can you tell me how to prepare them so that I can run your code directly?

yuelinan commented 1 year ago

You can find the dataset and word embedding here: http://people.csail.mit.edu/taolei/beer/

skybert623 commented 1 year ago

Thank you very much, but in which directory should I put the word embedding?

yuelinan commented 1 year ago

Hi, you can put the word embeddings anywhere, and then set the path in the code:

parser.add_argument('--embeddings', type=str,default="/data/lnyue/mi_nlp/beer/review+wiki.filtered.200.txt.gz", help="path to external embeddings")

in DARE/latent_rationale/beer/util.py

or

python -m latent_rationale.beer.dare \
    --model latent \
    --aspect 0 \
    --epochs 50 \
    --lr 0.00012 \
    --upper_bound 0.01 \
    --batch_size 200 \
    --embeddings /path/to/your/embeddings \
    --train_path ./beer/reviews.aspect0.train.txt.gz \
    --dev_path ./beer/reviews.aspect0.heldout.txt.gz \
    --test_path ./beer/annotations.json \
    --scheduler exponential \
    --save_path ./dare_a0 \
    --dependent-z \
    --selection 0.13 --lasso 0.02
skybert623 commented 1 year ago

Thank you very much, I can now run it successfully! It seems that the beer dataset is for regression. If I want to run a classification task, what should I do?

yuelinan commented 1 year ago

You can replace the code

 self.criterion = nn.MSELoss(reduction='none')

in Line 84 of DARE/latent_rationale/beer/models/latent.py with

 self.criterion = nn.CrossEntropyLoss()
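
Note that nn.CrossEntropyLoss expects [batch, num_classes] logits and int64 class-index targets, so the model's output layer also has to emit one logit per class rather than a single regression score. A minimal standalone sketch (the layer name and sizes here are illustrative, not taken from latent.py):

```python
import torch
import torch.nn as nn

hidden_size, num_classes = 200, 2

# For classification the head maps to num_classes logits
# (a regression head would be nn.Linear(hidden_size, 1) instead)
output_layer = nn.Linear(hidden_size, num_classes)
criterion = nn.CrossEntropyLoss()

x = torch.randn(5, hidden_size)                 # [B, hidden]
logits = output_layer(x)                        # [B, num_classes]
labels = torch.randint(0, num_classes, (5,))    # [B] int64 class indices
loss = criterion(logits, labels)                # 0-dim scalar by default
```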
skybert623 commented 1 year ago

I'm so sorry to disturb you again, but I get this error when I replace MSE with cross-entropy:

dare/latent_rationale/beer/models/latent.py", line 127, in get_loss
    loss_vec = loss_mat.mean(1)  # [B]
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

skybert623 commented 1 year ago

I'm still using the beer dataset, but I binarize the scores to 0 and 1 with:

if scores[0] >= 0.6:
    data_1.append((tokens, 1))
elif scores[0] <= 0.4:
    data_0.append((tokens, 0))

It still works when I use MSE, but it raises the above error when I use cross-entropy without any other adjustments.
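
As a self-contained sketch, the binarization step looks roughly like this (the `binarize` helper name and its thresholds-as-parameters signature are mine, not from the repository):

```python
def binarize(data, pos_threshold=0.6, neg_threshold=0.4):
    """Split (tokens, scores) pairs into binary classes by the aspect-0 score.

    Reviews scoring in the ambiguous middle band are dropped.
    """
    data_0, data_1 = [], []
    for tokens, scores in data:
        if scores[0] >= pos_threshold:
            data_1.append((tokens, 1))
        elif scores[0] <= neg_threshold:
            data_0.append((tokens, 0))
    return data_0, data_1
```

Reviews with scores strictly between 0.4 and 0.6 fall through both branches and are discarded.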

yuelinan commented 1 year ago

Hi, this error is due to the difference between MSELoss and CrossEntropyLoss. You can change the code:

loss_vec = loss_mat.mean(1) 
mse = loss_vec.mean()    

to

loss_vec = loss_mat  
mse = loss_vec  
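
The underlying shape difference can be checked in isolation: nn.MSELoss(reduction='none') keeps a per-element loss matrix, while nn.CrossEntropyLoss() defaults to reduction='mean' and returns a 0-dim scalar, so calling .mean(1) on it raises the IndexError above. A minimal standalone sketch (not the repository's code); note that passing reduction='none' to CrossEntropyLoss would instead keep a per-example [B] vector, which is an alternative fix that leaves the downstream .mean() calls unchanged:

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 2)          # [B, num_classes]
labels = torch.randint(0, 2, (4,))  # [B] int64 class indices

# Default reduction='mean': a 0-dim scalar, so scalar_loss.mean(1) would fail
scalar_loss = nn.CrossEntropyLoss()(logits, labels)

# reduction='none': per-example losses of shape [B], so .mean() still works
loss_vec = nn.CrossEntropyLoss(reduction='none')(logits, labels)
```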
skybert623 commented 1 year ago

I am sorry, but when I modify it for the classification task, it seems to work very badly. Could you please release code suitable for classification, like in your latest paper Inter_RAT?

yuelinan commented 1 year ago

Hi, I have uploaded latent_class.py in DARE/latent_rationale/beer/models/, which may help you.

skybert623 commented 1 year ago

Sorry, I still cannot get the right result when I modify it for the classification task. Could you please release the code and hyperparameters for the movie reviews dataset (which is a binary classification task)?


yuelinan commented 1 year ago

Hi, I have uploaded the code and hyperparameters for the movie reviews dataset, which may help you.

skybert623 commented 1 year ago

Thank you very much, I'll have a try!