lapplislazuli / Hopinosis

Opinosis Implementation in Haskell
MIT License
nlp nlp-library opinosis opinosis-summarizer text-summarization

Hopinosis


This repository contains the library "Hopinosis" - a Haskell implementation of Opinosis.

Opinosis builds a graph from a given text, where each node is a word in the text. For each node, the number of occurrences is recorded, and each word is connected to the word that follows it via an edge. Based on this, the most redundant paths can be found, where redundancy is defined either by "word occurrence" (node magnitude) or by "succession" (edge magnitude).
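The graph construction described above can be sketched roughly as follows. This is a minimal illustration and not the library's actual data model: the names WordNode, WordGraph, addSentence and buildGraph are hypothetical, and sentences are assumed to be already split into lists of words.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Hypothetical node payload (not taken from the Hopinosis source):
-- how often the word occurred (node magnitude) and how often each
-- successor followed it (edge magnitude).
data WordNode = WordNode
  { magnitude  :: Int
  , successors :: Map.Map String Int
  } deriving (Show, Eq)

type WordGraph = Map.Map String WordNode

-- Add one sentence (a list of words) to the graph: first count every
-- word, then count every (word, following word) pair as an edge.
addSentence :: WordGraph -> [String] -> WordGraph
addSentence graph ws = foldl' addEdge withWords (zip ws (drop 1 ws))
  where
    withWords = foldl' addWord graph ws
    -- Insert a fresh node, or bump the magnitude of an existing one.
    addWord g w = Map.insertWith bump w (WordNode 1 Map.empty) g
    bump _new old = old { magnitude = magnitude old + 1 }
    -- Bump the edge magnitude from one word to the word following it.
    addEdge g (from, to) = Map.adjust (follow to) from g
    follow to node =
      node { successors = Map.insertWith (+) to 1 (successors node) }

-- Build the graph for a whole text given as a list of sentences.
buildGraph :: [[String]] -> WordGraph
buildGraph = foldl' addSentence Map.empty
```

For example, buildGraph (map words ["the duck is great", "the duck is dark"]) yields a node "the" with magnitude 2 whose edge to "duck" also has magnitude 2, while "is" has two outgoing edges of magnitude 1 each.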

To yield human-readable sentences, only those paths are considered valid which:

  1. Start with a node marked as "start"
  2. End with a node marked as "end"
  3. Are acyclic
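The three conditions above could be checked with a depth-first walk like the one below. It is a sketch under assumptions, not the search Hopinosis actually performs: the Vertex record, its isStart and isEnd flags, and validPaths are hypothetical names, and magnitudes are left out to keep the example short.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical vertex type for illustration only.
data Vertex = Vertex
  { isStart :: Bool      -- the word began at least one sentence
  , isEnd   :: Bool      -- the word ended at least one sentence
  , next    :: [String]  -- words that followed this word
  }

type PathGraph = Map.Map String Vertex

-- Enumerate every path that starts at a start vertex, ends at an end
-- vertex and never visits the same word twice (acyclic).
validPaths :: PathGraph -> [[String]]
validPaths g = concatMap (walk []) starts
  where
    starts = [w | (w, v) <- Map.toList g, isStart v]
    -- 'visited' holds the path so far in reverse order.
    walk visited w
      | w `elem` visited = []            -- would close a cycle: discard
      | otherwise =
          case Map.lookup w g of
            Nothing -> []                -- dangling edge: discard
            Just v  ->
              let path    = w : visited
                  here    = [reverse path | isEnd v]
                  further = concatMap (walk path) (next v)
              in here ++ further
```

Note that this enumerates all valid paths, which grows quickly with the size of the graph; a real implementation would additionally rank or prune candidates by the node or edge magnitudes.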

Changes to proposal:

I have put some samples in this repository so you can see what to expect, along with rough runtime estimates on a common office laptop (a generic ThinkPad).

Build, Run and Test

Interactive

To run the code, go to /Src and start GHCi.

You@GHCI> :load Hopinosis.hs

This will let you use the library. (Don't forget to run cabal configure once! You will need the libraries it depends on.)

Build with Cabal

For more information on the setup, see the cabal file.

$> cabal new-build --enable-tests --enable-documentation
$> cabal new-test
$> cabal new-install

For the installation you need to have symlinks configured for your cabal. After that, you can use the library from anywhere on your machine.

If you're using Windows, I highly recommend changing that.

Run when installed

To run after the new-install, simply call:

Hopinosis -f ./Files/darkwing.txt -n 2 -d 0.51 -t 0.51 -v --sim jaccard

This will run the application. For an overview of the parameters, see Program.hs, which is reasonably well documented.

Run with Cabal

To run without installation, you can do:

cabal run Hopinosis -- -f ./Files/darkwing.txt -n 2 -d 0.51 -t 0.51 -v --sim jaccard

This also accepts RTS parameters such as +RTS -N2.

Documentation

This seems to be the right way to build documentation from source:

$> cabal act-as-setup -- haddock --builddir=dist-newstyle/build/x86_64-windows/ghc-8.6.3/Hopinosis-M.m.f --internal

This will create a lot of files for you; index.html is the starting point you are looking for.

Note: You may have to run a build with --enable-documentation beforehand.

Contribution

Your contribution is welcome! There are several topics you can help with:

If you want to help me via code, please refer to the Contribution Guidelines.

Additional Notes

Here are some thoughts on the project which may cross your mind: