-
Hi
I just went over your document clustering tutorial and it is really amazing! Great work!
I am trying to conduct a clustering of e-mails, so I have been altering the code a bit to fit my purpose.
…
-
Submitting Author: Raktim Mukhopadhyay (@rmj3197)
All current maintainers: @giovsaraceno
Package Name: QuadratiK
One-Line Description of Package: QuadratiK includes test for multivariate normality,…
-
Currently we don't pass a `task_type` - the API docs at https://docs.nomic.ai/reference/endpoints/nomic-embed-text say this:
> The task your embeddings should be specialized for: `search_query`, `s…
-
Update the following URL to point to the GitHub repository of
the package you wish to submit to _Bioconductor_
- Repository: https://github.com/Lu-Group-UKHD/SmartPhos
Confirm the following by …
-
Update/fix and test .dot file generation for inheritance and clustering graphing. Make command-line options visible and document them!
atiti updated
11 years ago
-
Since we'll be revamping the clustering infrastructure, it ought to be possible to partition documents into different datacenters based on some metric.
-
- Make sure weak match works for character offsets.
- Document-level and/or entity-level macro-averaging for NIL clustering.
- Switch off type matching
-
Explore the following issues. Either Daniel/Mandresy or ICCS will do the following:
Optimise training data / ML bias (or better RERUN time)
- Remove LAI and NPP from predictors (from pre-run)
- If …
ma595 updated
2 weeks ago
-
We can use TF or TF-IDF (term frequency, or term frequency-inverse document frequency) to produce vectors that represent our documents.
Another possible document encoding is word2vec. This is bett…
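
As a rough illustration of the TF-IDF idea above, here is a minimal standard-library sketch. The function names (`tfidf_vectors`, `cosine`), the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the sample documents are all assumptions made for this example, not something taken from the issue; a real pipeline would more likely use an existing library implementation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors): one TF-IDF vector per input document."""
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Unsmoothed IDF; a term present in every document gets weight 0.
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical sample documents, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stocks fell on monday",
]
vocab, vectors = tfidf_vectors(docs)
```

The resulting vectors can be compared with `cosine` or passed to a clustering algorithm; documents that share distinctive terms end up closer together than documents that share only terms common to the whole collection.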
-
@minhnn1