JuliaText / TextAnalysis.jl

Julia package for text analysis

Example should use MultivariateStats.jl #72

Closed: loleg closed this issue 5 years ago

loleg commented 6 years ago

https://github.com/JuliaStats/MultivariateStats.jl replaces https://github.com/JuliaStats/DimensionalityReduction.jl, so the Extended Usage example should be corrected: it no longer works on newer versions of Julia.

aviks commented 6 years ago

Thanks. Would you do a PR?

loleg commented 6 years ago

@aviks possibly - I started doing some further testing last weekend in here, but don't have a complete example yet - feel free to assign me to it.

aquatiko commented 5 years ago

@aviks I can't seem to find the Extended Usage example that needs fixing. That section was removed from the documentation some time back.

loleg commented 5 years ago

Indeed, I apologize for letting this issue hang for so long. @aviks could you please let us know if we can safely close it at this point?

aviks commented 5 years ago

Ah, that piece got dropped while reorganising the documentation. This is what it looked like. @aquatiko if you want to work on this, could you test the code, make sure it works, and add it to a new file in the docs/src directory?

# Extended Usage Example

To show you how text analysis might work in practice, we're going to work with
a text corpus composed of political speeches from American presidents given
as part of the State of the Union Address tradition.

    using TextAnalysis, DimensionalityReduction, Clustering

    crps = DirectoryCorpus("sotu")

    standardize!(crps, StringDocument)

    crps = Corpus(crps[1:30])

    remove_case!(crps)
    remove_punctuation!(crps)

    update_lexicon!(crps)
    update_inverse_index!(crps)

    crps["freedom"]

    m = DocumentTermMatrix(crps)

    D = dtm(m, :dense)

    T = tf_idf(D)

    cl = kmeans(T, 5)
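
For anyone picking this up, here is a rough, untested-against-the-docs sketch of how the same pipeline might look with MultivariateStats.jl in place of DimensionalityReduction.jl. It rests on several assumptions that are not from this thread: recent releases of all three packages, `prepare!(crps, strip_punctuation)` standing in for `remove_punctuation!`, `predict` as the PCA projection call (older MultivariateStats releases spelled it `transform`), and a made-up four-document corpus replacing `DirectoryCorpus("sotu")` so the snippet is self-contained. It also assumes the goal is to cluster documents, which is why the matrix is transposed before clustering.

```julia
# Sketch only: assumes recent TextAnalysis.jl, MultivariateStats.jl and
# Clustering.jl; the tiny corpus below is a hypothetical stand-in for
# DirectoryCorpus("sotu").
using TextAnalysis, MultivariateStats, Clustering

docs = [
    StringDocument("Freedom and liberty for every citizen."),
    StringDocument("The economy grew and taxes were cut."),
    StringDocument("We will defend freedom at home and abroad."),
    StringDocument("Jobs, taxes and the economy remain our focus."),
]
crps = Corpus(docs)

# Normalise the text (assumed replacement for remove_punctuation!)
remove_case!(crps)
prepare!(crps, strip_punctuation)

# Build the lexicon and inverse index so term lookup works
update_lexicon!(crps)
update_inverse_index!(crps)

hits = crps["freedom"]          # indices of documents containing the term

m = DocumentTermMatrix(crps)
D = dtm(m, :dense)              # rows are documents, columns are terms
T = tf_idf(D)

# MultivariateStats and Clustering both treat columns as observations,
# so put one document per column.
X = permutedims(T)

# Calls are module-qualified to avoid clashes between exported names.
pca = MultivariateStats.fit(PCA, X; maxoutdim = 2)
Y = MultivariateStats.predict(pca, X)   # `transform` on older releases

cl = kmeans(X, 2)               # cluster documents on their tf-idf vectors
assignments(cl)                 # cluster index for each document
```

With the real corpus, `docs` would be replaced by the `DirectoryCorpus("sotu")` and `standardize!` steps from the original example, and the cluster count raised back to 5.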
loleg commented 5 years ago

Thanks @aviks @aquatiko