yangheng95 / PyABSA

Sentiment Analysis, Text Classification, Text Augmentation, Text Adversarial defense, etc.;
https://pyabsa.readthedocs.io
MIT License

Shuffling at inference time #108

Closed lucasrafaelc closed 2 years ago

lucasrafaelc commented 2 years ago

Hey @yangheng95 ,

I'm using your framework to train and run inference on the ATE task. I'm having some issues with inference because PyABSA is shuffling the inference dataset. It is happening because the DataLoader in the AspectExtractor class is instantiated without shuffle=False. Is there a reason for that? Can it be changed or avoided in some way?

Thanks in advance for your help! Lucas

yangheng95 commented 2 years ago

I haven't noticed similar behaviour. Can you point to the relevant code (line) that shuffles the inference dataset?

lucasrafaelc commented 2 years ago

In aspect_extractor.py, function _infer():

Line 339: self.eval_dataloader = DataLoader(eval_data, sampler=eval_sampler, batch_size=self.opt.eval_batch_size)

It's missing "shuffle=False" in the DataLoader constructor.

yangheng95 commented 2 years ago

The "sampler" parameter is mutually exclusive with "shuffle", and the sampler should not shuffle the inference dataset.
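
For reference, the behaviour described in this reply can be checked in isolation with a minimal sketch. It assumes eval_sampler is a torch.utils.data.SequentialSampler; the thread does not show how PyABSA actually builds the sampler. The toy TensorDataset below is a stand-in for the real inference data.

```python
import torch
from torch.utils.data import DataLoader, SequentialSampler, TensorDataset

# Toy stand-in for the inference dataset: items 0..7 in their original order.
eval_data = TensorDataset(torch.arange(8))

# Assumption: a sequential sampler, used with the same DataLoader pattern
# as the line quoted from aspect_extractor.py.
eval_sampler = SequentialSampler(eval_data)
eval_dataloader = DataLoader(eval_data, sampler=eval_sampler, batch_size=3)

# Flatten the batches back into a list and compare with the input order.
seen = [int(x) for (batch,) in eval_dataloader for x in batch]
order_preserved = seen == list(range(8))

# DataLoader rejects sampler combined with shuffle=True.
try:
    DataLoader(eval_data, sampler=eval_sampler, shuffle=True)
    conflict_rejected = False
except ValueError:
    conflict_rejected = True

print(order_preserved, conflict_rejected)
```

If the order is preserved here but not in the result file, the reordering would have to happen somewhere after the DataLoader, not inside it.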

lucasrafaelc commented 2 years ago

Here is my code:

atepc_result = aspect_extractor.extract_aspect(inference_source=PATH_TO_INFER_DATASET,
                                               save_result=True,
                                               print_result=False,  # print the result
                                               pred_sentiment=True,  # Predict the sentiment of extracted aspect terms
                                               )

Sentences in the inference dataset are A-B-C-D. In the result file, sentences are like C-B-D-A. How can I avoid this?
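
Until the cause is found, one possible workaround is to reorder the results on the caller's side. The sketch below is not part of PyABSA: restore_input_order is a hypothetical helper, and it assumes each result is a dict that carries the original sentence text under a "sentence" key, which should be checked against the actual output.

```python
from collections import defaultdict, deque


def restore_input_order(sentences, results, key="sentence"):
    """Return `results` rearranged to follow the order of `sentences`.

    Hypothetical helper: assumes every result dict stores the original
    sentence text under `key`, and that there is one result per sentence.
    """
    # Map each sentence text to the queue of positions where it occurs,
    # so duplicate sentences are matched one-to-one.
    positions = defaultdict(deque)
    for index, text in enumerate(sentences):
        positions[text].append(index)

    ordered = [None] * len(sentences)
    for result in results:
        # Raises IndexError if a result does not match any input sentence.
        ordered[positions[result[key]].popleft()] = result
    return ordered


sentences = ["A", "B", "C", "D"]
shuffled = [{"sentence": s} for s in ["C", "B", "D", "A"]]
restored = restore_input_order(sentences, shuffled)
print([r["sentence"] for r in restored])
```

Matching on sentence text only works if the text in the results is identical to the input. If the framework tokenizes or normalizes sentences, the key would need the same normalization.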

yangheng95 commented 2 years ago

@XuMayi please check whether the problem comes from the output-merging function in aspect_extractor.py.

yangheng95 commented 2 years ago

Did you find the problem? @lucasrafaelc @XuMayi