Closed Ayushk4 closed 5 years ago
I am also replacing the deprecated functions in the documentation. So, far I have finished documents.md
.
author
maybe?
On Sun, Mar 17, 2019 at 9:06 AM Ayush Kaushal notifications@github.com wrote:
@aviks https://github.com/aviks I was going through the docs. I found that names https://github.com/JuliaText/TextAnalysis.jl/blob/b3c467cebb776c6a7e686cfe2b6c9a20c2ca2205/src/metadata.jl#L32 (used for setting metadata - author name for entire corpus of docs), conflicts with the base refer (v0.7) https://docs.julialang.org/en/v0.7/base/base/#Base.names and (v1) https://docs.julialang.org/en/v1/base/base/#Base.names. What do you suggest about this? Should I open an issue?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/JuliaText/TextAnalysis.jl/pull/134#issuecomment-473643205, or mute the thread https://github.com/notifications/unsubscribe-auth/AAXIJuB1qmtFTXVLGmXZV1DwMdYi5POJks5vXgWRgaJpZM4b3U1e .
A correction - Author/authors is already used for document's author, should I change the metadata containing the document name to documentName
. Also, I am sending a separate PR for this.
The following are the metadata - names
-> setting the document's name and authors
for setting the author's name.
This is also missing from the documentation for accessing the corpus metadata, I am adding it.
Documenter.jl
seems to be at this point a better choice since all code snippets are (re-)evaluated at branch modifications and reflect changes in the code. Is there any reason for not using it ?
A more flexible approach to the metadata field names is to allow a metadata type hierarchy (i.e. define an AbstractMetadata
type) https://zgornel.github.io/StringAnalysis.jl/dev/doc_extensions/
I don't see any downside of using Documenter.jl
features of @repl
and doctests. Maybe this can be reflected across the documentation of JuliaText Repositories.
A lot of things has to be ported from StringAnalysis.jl. I came across this issue - https://github.com/zgornel/StringAnalysis.jl/issues/2#issuecomment-451190789.
I think that maybe we can start by opening an issue of what all major changes are to be done for TextAnalysis.jl, taking into consideration the things mentioned in the conversation above. Thank you, for your review on the situation.
@aviks Can you please suggest your views on this, so that I may proceed accordingly.
Thanks Ayush, this looks good. Let me know once this is ready to merge. I'm happy to merge things over from StringAnalysis if Corneliu is ok with that.
@aviks I'm perfectly fine with this; feel free to port to TextAnalysis
whatever you see fit
Separate one please
On Mon, 18 Mar 2019, 18:48 Ayush Kaushal, notifications@github.com wrote:
@Ayushk4 commented on this pull request.
In docs/src/corpus.md https://github.com/JuliaText/TextAnalysis.jl/pull/134#discussion_r266590785 :
@@ -83,3 +139,56 @@ corpus. The easiest way to do this is to convert a
Corpus
object into aDataFrame
:convert(DataFrame, crps)
+ +## Corpus Metadata + +You can also retrieve the metadata for every document in a
Corpus
at once: + +languages()
: What language is the document in? Defaults toLanguages.English()
, a Language instance defined by the Languages package. +names()
: What is the name of the document? Defaults to"Unnamed Document"
.Do you prefer that I send another PR making changes, name -> title & name! -> title! & names -> titles & names! to titles! (across the entire codebase, including tests, docs) ? Or include in this one itself?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/JuliaText/TextAnalysis.jl/pull/134#discussion_r266590785, or mute the thread https://github.com/notifications/unsubscribe-auth/AAXIJswvIK-wRXsrqwkjVruj7A05Stdaks5vX9-NgaJpZM4b3U1e .
@aviks I am almost done with the PR. All that remains to be done, is #135 being merged so that I can remove the merge conflicts that may arise.
Also, lsa()
is not working. I will get more details about it and file a detailed issue.
@aviks Do you think, BM25 and Co-occurrence matrix will be a good addition to TextAnalysis.jl?
@aviks Do you think, BM25 and Co-occurrence matrix will be a good addition to TextAnalysis.jl?
Yes please!
This PR is aimed at improving the docs in the following ways -
Also testing for the possible errors in the package.
Progress