Historically, I never had a good use case for analyzing readily available text. Having stumbled on some old work, I think Data.gov is a prime candidate for text summarization. I don't know whether this should live in this repo or another one, but an example in this repo might work best.
From https://github.com/GSA/data.gov/issues/4068, it is evident that some pre-processing will be necessary. However, I feel the data shouldn't be manipulated too much, because it spans lots of different vocabularies. Whether to process it further can be decided once initial results are in.
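As a rough sketch of what "not too much" pre-processing could look like, the snippet below only unescapes HTML entities, drops markup, and collapses whitespace. It deliberately does no lowercasing, stemming, or stop-word removal, so agency-specific vocabulary stays intact. The function name `light_clean` and the sample string are hypothetical, and the assumption that dataset descriptions contain stray HTML is mine, not something confirmed in the linked issue.

```python
import html
import re

# Assumption: descriptions may carry leftover HTML tags and entities.
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def light_clean(text: str) -> str:
    """Minimal cleanup that leaves the wording itself untouched."""
    text = html.unescape(text)    # turn entities such as "&amp;" back into characters
    text = TAG_RE.sub(" ", text)  # replace anything that looks like a tag with a space
    # Collapse runs of spaces, tabs, and newlines into single spaces.
    return WHITESPACE_RE.sub(" ", text).strip()


# Hypothetical sample, not taken from a real Data.gov record.
sample = "<p>Annual   survey of U.S. &amp; territorial\nwater quality.</p>"
print(light_clean(sample))
```

Anything heavier (normalizing case, merging synonyms across vocabularies) could be layered on later if the first summaries turn out too noisy.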
The point is not to just get a bunch of keywords. The point is to tell the story of America. While it's evident that most of the data is not updated super frequently, it might be possible to highlight different aspects of the story based on different time periods.
This will hopefully feed into my plans to road trip across America and connect people and data.