DistrictDataLabs / baleen

An automated ingestion service for blogs to construct a corpus for NLP research.
MIT License

commit seed file to /fixtures #50

Closed echolabstech closed 8 years ago

echolabstech commented 8 years ago

I found the seed file, feedly.opml, in /tests/fixtures/

According to the install instructions, we should move it to /fixtures/

bbengfort commented 8 years ago

Don't move it, but create a fixtures/feedly.opml that has several RSS seed feeds for getting started with Baleen in a meaningful way.

bbengfort commented 8 years ago

@will2041 has this one.

On Thursday, June 2, 2016, echolabstech <notifications@github.com> wrote:

I found the seed file, feedly.opml, in /tests/fixtures/

We should move it to /fixtures/

will2041 commented 8 years ago

Just for tracking, I'm planning on restoring the original file that was deleted and also adding a version with only a handful of feeds.

bbengfort commented 8 years ago

Do we need the whole original file or just the small sample one?

On Thursday, June 2, 2016, will2041 <notifications@github.com> wrote:

Just for tracking, I'm planning on restoring the original file that was deleted (https://github.com/DistrictDataLabs/baleen/commit/da54aa8a337b6521cc8804400f4077903d686f35#diff-c66d4178eab3446ab918e24e47edbacc) and also adding a version with only a handful of feeds.

will2041 commented 8 years ago

Eh, either/or... My thinking is that it would be nice to have the larger one around just in case someone wants a larger dataset.

I'll make the trimmed down one "feedly.opml" and keep the original big one around as "feedly_large.opml". It's just 18KB.

will2041 commented 8 years ago

Seed files added. Closing.