dhmit / gender_analysis

A toolkit for analyzing gendered language across sets of documents
BSD 3-Clause "New" or "Revised" License

Corpus __init__ -- take dataset locations as args, not a corpus name #16

Closed ryaanahmed closed 5 years ago

ryaanahmed commented 5 years ago

@samimak37 -- @sophiazhi, @meesuekim, and I were brainstorming about what the best way into the module would be, and we realized that we shouldn't always require the user to provide a metadata CSV file. There are a lot of our analysis tools that you can run on a corpus with no metadata.

So, we need a Corpus.__init__() that takes as args

If the caller does not supply the location of a metadata CSV file, we'll construct a metadata dict ourselves, with only a 'filename' key.

ryaanahmed commented 5 years ago

Maybe take 'name' as an optional arg...
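
A minimal sketch of the constructor discussed above, for illustration only. The parameter names `file_path` and `csv_path`, the helper methods, and the `*.txt` file pattern are assumptions and are not taken from the gender_analysis codebase. The only behavior drawn from this thread is that the metadata CSV is optional, that a missing CSV yields metadata with only a 'filename' key, and that 'name' is optional.

```python
import csv
from pathlib import Path


class Corpus:
    """Hypothetical sketch; not the actual gender_analysis implementation."""

    def __init__(self, file_path, csv_path=None, name=None):
        # file_path: directory holding the documents (assumed to be .txt files)
        # csv_path: optional location of a metadata CSV file
        # name: optional human-readable label for the corpus
        self.path_to_files = Path(file_path)
        self.csv_path = Path(csv_path) if csv_path is not None else None
        self.name = name

        if self.csv_path is None:
            self.metadata = self._metadata_from_filenames()
        else:
            self.metadata = self._metadata_from_csv()

    def _metadata_from_filenames(self):
        # No CSV supplied: build one metadata dict per document,
        # containing only a 'filename' key.
        return [
            {'filename': path.name}
            for path in sorted(self.path_to_files.glob('*.txt'))
        ]

    def _metadata_from_csv(self):
        # CSV supplied: each row becomes a metadata dict.
        # Requiring a 'filename' column is an assumption of this sketch.
        with open(self.csv_path, newline='', encoding='utf-8') as csv_file:
            rows = list(csv.DictReader(csv_file))
        for row in rows:
            if 'filename' not in row:
                raise ValueError("metadata CSV must have a 'filename' column")
        return rows
```

With this shape, `Corpus('path/to/texts')` would work on a bare directory of documents, while `Corpus('path/to/texts', csv_path='path/to/metadata.csv', name='my corpus')` would load the fuller metadata. Both paths here are placeholders.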

ryaanahmed commented 5 years ago

done! wahoo.