List of corpora

Enough AMR corpora are becoming available, even for English alone, that it is hard to keep track of them all, and they are not always easy to find from the publication entries in our bibliography.

It may be time for a separate page listing corpora. This could start as a Google Sheet, with columns for language, dataset name, dialect (original, Dialogue-AMR, etc.), size of annotated data (tokens, AMRs), URL, and a reference to the publication entry.
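To make the proposed columns concrete, here is a rough sketch of what one row could look like if the sheet were exported to CSV. The field names (`num_tokens`, `num_amrs`, `bib_key`) and the `to_csv` helper are my own guesses for illustration, not an agreed schema, and the sample values are placeholders rather than a real corpus.

```python
import csv
import io
from dataclasses import dataclass, fields, asdict
from typing import Optional


@dataclass
class CorpusEntry:
    """One row of the hypothetical corpus list (column names are assumptions)."""
    language: str
    dataset_name: str
    dialect: str                # e.g. "original", "Dialogue-AMR"
    num_tokens: Optional[int]   # size in tokens, if known
    num_amrs: Optional[int]     # size in AMRs, if known
    url: str
    bib_key: str                # reference to the publication entry in the bibliography


# Column order follows the order the fields are declared in.
COLUMNS = [f.name for f in fields(CorpusEntry)]


def to_csv(entries):
    """Serialize entries to CSV text; unknown sizes become empty cells."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder row, not a real dataset.
    example = CorpusEntry(
        language="English",
        dataset_name="Example-Corpus",
        dialect="original",
        num_tokens=None,
        num_amrs=100,
        url="https://example.org/corpus",
        bib_key="example2024",
    )
    print(to_csv([example]))
```

Keeping the sizes optional seems useful, since not every release reports both token and AMR counts.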

The principle would be: the Bibliography page lists things to read or cite, and the Corpus List page lists corpora to download.

nert-nlp / AMR-Bibliography

List of corpora #255