CoderPat / structured-neural-summarization

A repository with the code for the paper with the same title
MIT License
74 stars 26 forks

Is the file incomplete? #5

Closed shellycsy closed 5 years ago

shellycsy commented 5 years ago

First, thanks for sharing!

While preparing the data, I ran this step:

python convert2graph.py /path/to/output/xml /path/to/summaries /path/to/output

It fails with the following error:

Traceback (most recent call last):
  File "convert2graph.py", line 23, in <module>
    from parsers.naturallanguage.gigaword.loadgigacorpus import parse_sample
ImportError: No module named 'parsers'

I can't find "gigaword" under "naturallanguage", so is the repository incomplete?
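
One way to narrow down an ImportError like the one above is to check which part of the dotted module path is the first that Python cannot find. The sketch below is only an illustration: the helper name first_missing_module is hypothetical and not part of the repository, and it assumes the check is run from the same working directory as convert2graph.py. Note that importlib.util.find_spec imports parent packages as a side effect.

```python
import importlib.util
from typing import Optional


def first_missing_module(dotted_name: str) -> Optional[str]:
    """Return the shortest prefix of dotted_name that cannot be found, or None.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    parts = dotted_name.split(".")
    for i in range(1, len(parts) + 1):
        prefix = ".".join(parts[:i])
        try:
            # find_spec returns None when the module is absent; it raises
            # ImportError when a parent turns out not to be a package.
            spec = importlib.util.find_spec(prefix)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            return prefix
    return None


if __name__ == "__main__":
    # Module path taken from the traceback above.
    name = "parsers.naturallanguage.gigaword.loadgigacorpus"
    print(first_missing_module(name))
```

If this prints parsers, the top-level package is not visible from the current directory at all; if it prints a longer prefix, the package is found but that submodule is missing from the checkout.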

CoderPat commented 5 years ago

Hi, thank you for your interest. While porting and cleaning up the code, I forgot to include the scripts for another, earlier dataset that was tested. It should be fixed now.

shellycsy commented 5 years ago

When I run the updated file, it produces no result and only prints the usage message:

python convert2graph.py --debug 123.xml summaries output
Usage:

shellycsy commented 5 years ago

I see this code in convert2graph.py:

def parse_cnndm_file(filename: str, write_sample_callback: Callable, summaries_folder: str) -> None:
    def process_sample(location, sample):
        assert location[0][0] == 'root'
        assert location[1][0] == 'document'
        provenance = sample['docId']

        def get_summary_tokens(sample_data) -> List[str]:
            assert provenance.endswith('.article')
            with open(os.path.join(summaries_folder, provenance[:-len('.article')] + '.abstr')) as f:
                summary = f.read()
                return [t.strip() for t in summary.split() if len(t.strip()) > 0]

        parsed = parse_sample(sample, provenance, headline_getter=get_summary_tokens)
        if parsed is not None:
            write_sample_callback(parsed)

        return True

    load_xml(filename, depth=2, func=process_sample)

I guess the problem may be with summaries_folder, but I do not have this file, so I do not know what I should do.
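
Reading the snippet above, each document's docId is expected to end in '.article', and the matching summary is opened at summaries_folder/<docId without '.article'>.abstr. A minimal sketch for checking which summary files would be missing is shown below. The helper names expected_summary_path and missing_summaries are hypothetical and not part of the repository; only the '.article' and '.abstr' suffixes come from the quoted code.

```python
import os
from typing import Iterable, List

ARTICLE_SUFFIX = ".article"  # suffix asserted in the quoted code
SUMMARY_SUFFIX = ".abstr"    # suffix of the summary file the quoted code opens


def expected_summary_path(summaries_folder: str, doc_id: str) -> str:
    """Build the summary path the quoted code would open for doc_id."""
    if not doc_id.endswith(ARTICLE_SUFFIX):
        raise ValueError(f"docId does not end with {ARTICLE_SUFFIX!r}: {doc_id!r}")
    stem = doc_id[:-len(ARTICLE_SUFFIX)]
    return os.path.join(summaries_folder, stem + SUMMARY_SUFFIX)


def missing_summaries(summaries_folder: str, doc_ids: Iterable[str]) -> List[str]:
    """Return the expected summary paths that do not exist on disk."""
    paths = (expected_summary_path(summaries_folder, d) for d in doc_ids)
    return [p for p in paths if not os.path.isfile(p)]


if __name__ == "__main__":
    # Example docIds are made up for illustration.
    print(missing_summaries("summaries", ["doc1.article", "doc2.article"]))
```

An empty list means every expected summary file is present; otherwise the listed paths show exactly which files the script would fail to open.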

shandou commented 5 years ago

I have the same issue: the submodule parsers.naturallanguage.gigaword does not exist in the current version of the code. Would you mind uploading it? Thanks!