weka511/nlp
My experiments with Natural Language Processing. I've created a few programs to try out concepts.
GNU General Public License v3.0
1 star, 0 forks
issues
Word2vec2: build vocabulary appears greedy
#39
weka511
opened
11 months ago
2
The example files from blogs.zip are too big to train on
#38
weka511
opened
11 months ago
0
Parsing errors in blogs.zip
#37
weka511
closed
11 months ago
0
Word2Vec2: add blogs.zip to corpus
#36
weka511
closed
11 months ago
0
Display default values with help text
#35
weka511
closed
12 months ago
0
Declutter the git status list
#34
weka511
closed
12 months ago
0
Allow user to suppress checkpoint
#33
weka511
closed
1 year ago
0
Loss increasing following resume
#32
weka511
closed
1 year ago
6
Save figures to a separate directory
#31
weka511
closed
1 year ago
0
Allow controlled stop using stopfile
#30
weka511
closed
1 year ago
0
A noise word should not be the target word
#29
weka511
closed
1 year ago
0
Show eta in "e" format, rather than "f"
#28
weka511
closed
1 year ago
0
Implement momentum
#27
weka511
opened
1 year ago
0
Choose tau automatically (Goodfellow et al)
#26
weka511
closed
1 year ago
0
Scale loss by number of data points
#25
weka511
closed
1 year ago
0
Are frequencies accurate when we create examples?
#24
weka511
closed
1 year ago
0
Overflow during training
#23
weka511
closed
1 year ago
3
Record the parameters used for creating data and training so we can reconstruct
#22
weka511
closed
1 year ago
0
Check that we aren't specifying file names when we resume
#21
weka511
closed
1 year ago
0
If Loss is Inf, stop training
#20
weka511
closed
1 year ago
0
Preprocess data
#19
weka511
closed
1 year ago
1
Use Noise Contrastive Estimation (NCE) for word2vec
#18
weka511
closed
1 year ago
3
Explore clustering of word vectors
#17
weka511
opened
1 year ago
1
Declutter data
#16
weka511
closed
1 year ago
0
Explore Transformers
#15
weka511
opened
1 year ago
1
Explore probability distribution of w.w and w.c in word2vec2
#14
weka511
opened
1 year ago
0
Create my own skip-gram implementation
#13
weka511
closed
1 year ago
0
Implement test cases for tf-idf
#12
weka511
closed
1 year ago
0
Implement tf-idf
#11
weka511
closed
1 year ago
0
Port word2vec to Python 3.11 and WingIDE 9
#10
weka511
closed
1 year ago
0
Does Stochastic Gradient improve performance?
#9
weka511
closed
1 year ago
0
Is word2vec broken?
#8
weka511
closed
1 year ago
1
Word2Vec is slow - can we speed up using ideas from [Mikolov et al](https://arxiv.org/abs/1301.3781/)?
#7
weka511
closed
1 year ago
0
Non-corpus is now obsolete
#6
weka511
closed
3 years ago
0
Allow for new results file to be added with sequence number.
#5
weka511
closed
3 years ago
0
Allow training to resume from an earlier run
#4
weka511
closed
3 years ago
0
Create separate script for plotting
#3
weka511
closed
3 years ago
0
Handle larger corpus
#2
weka511
closed
3 years ago
1
Some word2vec test results look dodgy
#1
weka511
closed
3 years ago
1