Yale-LILY / SummerTime

An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo
https://arxiv.org/abs/2108.12738
Apache License 2.0

HMNetModel cannot be loaded #75

Open ismu opened 3 years ago

ismu commented 3 years ago

Hi :)

I have a problem with HMNetModel. I can't load it in your Colab notebook. I wrote this code for the task:

from model import HMNetModel

hmn = HMNetModel()

I get this error:

usage: ipykernel_launcher.py [-h] [--command COMMAND] [--conf_file CONF_FILE]
                             [--PYLEARN_MODEL PYLEARN_MODEL]
                             [--master_port MASTER_PORT] [--cluster CLUSTER]
                             [--dist_init_path DIST_INIT_PATH] [--fp16]
                             [--fp16_opt_level FP16_OPT_LEVEL] [--no_cuda]
                             [--config_overrides CONFIG_OVERRIDES]
ipykernel_launcher.py: error: unrecognized arguments: -f /root/.local/share/jupyter/runtime/kernel-d260cea4-6d66-4567-bbda-0e4a0fefba13.json
An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

Do you know what's happening? Thanks in advance for your help :)

ismu commented 3 years ago

I investigated this problem a bit. It seems to fail only on Colab. When I install SummerTime locally, the model loads.
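The usage message in the error looks like argparse output, which suggests that something in the HMNet loading path parses `sys.argv`. In a notebook kernel, `sys.argv` contains `-f <connection file>`, and `parse_args()` exits with status 2 on arguments it does not recognize. The sketch below reproduces that behaviour and shows one possible workaround. It is an illustration under assumptions, not SummerTime's actual code: `build_parser` is a hypothetical stand-in for whatever parser HMNet uses, and `clean_argv` is a helper made up for this example.

```python
import argparse
import sys
from contextlib import contextmanager


def build_parser():
    # Hypothetical stand-in for the parser used inside the HMNet code;
    # only two of the flags from the usage message are reproduced here.
    parser = argparse.ArgumentParser(prog="ipykernel_launcher.py")
    parser.add_argument("--conf_file")
    parser.add_argument("--no_cuda", action="store_true")
    return parser


@contextmanager
def clean_argv(extra_args=None):
    """Temporarily hide the kernel's own arguments from argparse.

    Hypothetical helper: keeps sys.argv[0], drops everything else,
    and restores the original list on exit.
    """
    saved = sys.argv
    program = saved[0] if saved else "script"
    sys.argv = [program] + list(extra_args or [])
    try:
        yield
    finally:
        sys.argv = saved


def parse_in_notebook():
    # parse_args() with no arguments reads sys.argv[1:]. If that holds
    # "-f kernel-....json", argparse reports "unrecognized arguments"
    # and raises SystemExit(2). Inside clean_argv() it sees an empty list.
    with clean_argv():
        return build_parser().parse_args()
```

If this is indeed the cause, wrapping the constructor call in `with clean_argv():` may let the model load in a notebook. A change on the library side, such as using `parse_known_args()` or passing an explicit argument list, would avoid the need for a workaround.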

niansong1996 commented 3 years ago

Can you try again now? #79 was just merged to solve some issues with HMNet.

ismu commented 3 years ago

Yes, I tested your code and everything seems to run. One problem is that HMNetModel isn't downloaded automatically; I had to do it manually. Is that related to #84? The second problem is, I think, the poor quality of the model's results. I don't know whether the problem is in the model or in the preprocessing of its input. The example in the official publication looks better. I'll try to investigate this a bit :)

EDIT: I see you changed the model in your pull request. What is the difference between these models? The AMI fine-tuned model seems to work better.

niansong1996 commented 3 years ago

#84 is more related to dataset caching, not models per se. For HMNet, after #79, the model should be downloaded automatically on first use if it hasn't been cached before.
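The download-once behaviour described here can be sketched roughly as follows. This is only an illustration of the general pattern, not the code from #79: `ensure_cached` and the `fetch` callable are hypothetical names, and the real download logic and cache location in SummerTime may differ.

```python
from pathlib import Path
from typing import Callable


def ensure_cached(target: Path, fetch: Callable[[Path], None]) -> bool:
    """Make sure `target` exists, calling `fetch` only when it is missing.

    `fetch` is a caller-supplied function (hypothetical here) that writes
    the model file to the path it is given. Returns True if a download
    happened and False if the cached copy was reused.
    """
    if target.exists():
        return False  # already cached, nothing to do

    target.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary name first so that an interrupted download
    # does not leave a partial file that later looks like a valid cache.
    partial = target.with_name(target.name + ".part")
    fetch(partial)
    partial.replace(target)
    return True
```

With this pattern, the first call performs the download and later calls return immediately. A model that has to be fetched manually usually means the check or the fetch step did not run.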

I believe the AMI model is fine-tuned on AMI dialogue data and the new one is pre-trained on "dialogue-ized" CNN/DM data. I found the latter to work better in my case, hence the change. But it should be easy enough to give users the option; I'll be on that soon.

ismu commented 3 years ago

Yes, I think the option to choose a model would be good :)