JuliaText / Embeddings.jl

Functions and data dependencies for loading various word embeddings (Word2Vec, FastText, GloVe)
MIT License
81 stars, 19 forks

Adding Paragram Embedding #30

Closed tejasvaidhyadev closed 4 years ago

tejasvaidhyadev commented 4 years ago

I am adding the Paragram embeddings, along with a Google Drive download fetch method for DataDeps, in Paragram. I think it would be great to have a default Google Drive fetch method in DataDeps.

oxinabox commented 4 years ago

I am adding Paragram Embedding and Google-drive download fetch method for datadeps in Paragram.

I think it will be great to have default G-drive download fetch method in Datadeps

I made a prototype of this a few years back: https://github.com/oxinabox/PyDrive.jl. It would be nice to see something more polished. I don't know whether it belongs in DataDeps or in a GoogleDrive.jl package that supports DataDeps (potentially similar to how AWSS3.jl supports DataDeps). I guess it's not much code, so it could go in DataDeps. But once you add auth (like the PyDrive.jl demo does), the code grows, so it might be better to have it elsewhere.

I agree that it doesn't belong here. It would be kind of nice to hold off on this PR until that code finds a better home. But if you need these for a project, we can look at keeping it here for now.
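
For context, here is a rough sketch of what a Google Drive fetch method for DataDeps could look like. It is not the code from this PR or from PyDrive.jl. The names `gdrive_file_id`, `gdrive_direct_url`, and `fetch_gdrive` are hypothetical, and the sketch assumes a publicly shared file that the `uc?export=download` endpoint serves directly (no auth, no large-file confirmation page). It relies only on the DataDeps convention that a `fetch_method` takes `(remote_path, local_dir)` and returns the local file path.

```julia
using Downloads  # standard library since Julia 1.6

# Extract the file ID from a Google Drive share link.
# Handles two common shapes: .../file/d/<ID>/... and ...?id=<ID>
function gdrive_file_id(url::AbstractString)
    m = match(r"/d/([A-Za-z0-9_-]+)", url)
    m === nothing && (m = match(r"[?&]id=([A-Za-z0-9_-]+)", url))
    m === nothing && throw(ArgumentError("no Google Drive file ID found in: $url"))
    return String(m.captures[1])
end

# Build a direct-download URL from a share link (assumes a public file).
gdrive_direct_url(url::AbstractString) =
    "https://drive.google.com/uc?export=download&id=" * gdrive_file_id(url)

# Hypothetical fetch method following the DataDeps convention:
# (remote_path, local_dir) -> local file path.
# Simplification: the file is saved under its ID, so any extension is lost.
function fetch_gdrive(remote_path::AbstractString, local_dir::AbstractString)
    local_path = joinpath(local_dir, gdrive_file_id(remote_path))
    Downloads.download(gdrive_direct_url(remote_path), local_path)
    return local_path
end

# With DataDeps.jl installed, the function would plug in via `fetch_method`
# (the name, message, and <FILE_ID> below are placeholders):
# using DataDeps
# register(DataDep("ParagramDemo", "Demo registration",
#                  "https://drive.google.com/file/d/<FILE_ID>/view";
#                  fetch_method = fetch_gdrive))
```

Keeping the URL handling in small pure functions means it can be tested without network access, which is part of why a separate package looks attractive once auth is added.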

tejasvaidhyadev commented 4 years ago

For now, I don't need it for any project, but I would love to find a home for the code. Any suggestions on how to proceed?

oxinabox commented 4 years ago

I suggest you create a GoogleDrive.jl package based on the code you have. (Use PkgTemplates if you are not already.)

Register that package, then we will add it as a dependency of Embeddings.jl

tejasvaidhyadev commented 4 years ago

Hi @oxinabox, I created the GoogleDrive.jl package and also opened a registration PR. After that PR is merged, I will push all the changes suggested above to this PR.

tejasvaidhyadev commented 4 years ago

@oxinabox, as suggested, GoogleDrive.jl has been added as a dependency, and common.jl has been added.

codecov-io commented 4 years ago

Codecov Report

Merging #30 into master will decrease coverage by 0.81%. The diff coverage is 88.46%.

@@            Coverage Diff             @@
##           master      #30      +/-   ##
==========================================
- Coverage   96.77%   95.95%   -0.82%     
==========================================
  Files           4        6       +2     
  Lines          93       99       +6     
==========================================
+ Hits           90       95       +5     
- Misses          3        4       +1

Impacted Files      Coverage Δ
src/Embeddings.jl   100% <100%> (ø) :arrow_up:
src/glove.jl        100% <100%> (+8%) :arrow_up:
src/Paragram.jl     80% <80%> (ø)
src/common.jl       89.47% <89.47%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 65390f1...ac45848.

oxinabox commented 4 years ago

thanks

tejasvaidhyadev commented 4 years ago

Hi @oxinabox, I will soon implement _load_embeddings_csv for FastText as well. Sorry for the delay; my exams are going on.