chengchingwen opened 2 years ago
The new tokenizer API (using TextEncodeBase) is basically finished and included in the 0.1.16 release, though the gpt part is skipped for now. The next step is fixing the huggingface download issue with HuggingFaceApi.jl. Rewriting the attention layer might be breaking, so that will probably be the last thing to do.
Some other issues that might also need to be tracked:
@chengchingwen Peter, what is the approximate timeframe for implementing the model transfer from Huggingface?
@MNLubov Are you looking for a specific model from HuggingFace? I'm trying to fix the huggingface module this month, so if everything goes well, it would be workable again before August.
Just to clarify, even if that huggingface module is fixed, it's still possible that we don't have the implementation for that model type (by model type, I mean something like `bert`, `gpt2`, `t5`, etc.). So if you are looking for a model type that we don't have, please open another issue (and the timeline for that would be unknown for now).
@chengchingwen Thanks for the clarification. Currently I am testing different sentence-transformers from Huggingface to find the one most suitable for my purposes. As a temporary solution, I use PyCall (see the sketch below). As far as I understand, you now have `bert`, `gpt`, and `roberta` implementations.
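Roughly, that PyCall workaround looks like this (assuming the Python `sentence-transformers` package is installed in the environment PyCall uses; the checkpoint name is just an example):

```julia
using PyCall

# drive the Python sentence-transformers package from Julia as a stopgap
st = pyimport("sentence_transformers")
model = st.SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2")

sentences = ["This is an example sentence.", "Each sentence is converted to a vector."]
embeddings = model.encode(sentences)  # one embedding row per sentence
size(embeddings)                      # e.g. (2, 384) for MiniLM-L12
```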
@MNLubov Yes. I haven't investigated the sentence-transformers implementation, but it seems it can also be done with the normal huggingface interface. For example, https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2 is a `bert` model, so it should be workable by following the huggingface transformer usage in the readme once we fix the module.
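To illustrate, a rough sketch of what that might look like after the fix (the exact `hgf"..."` items and the `encode` call are assumptions based on the readme-style interface, so names may differ between releases):

```julia
using Transformers
using Transformers.HuggingFace  # assumes the huggingface module once the download fix lands
using TextEncodeBase            # provides the generic encode interface

# hypothetical sketch: the checkpoint itself is a plain bert model,
# so the generic bert loaders should apply
textenc = hgf"sentence-transformers/all-MiniLM-L12-v2:tokenizer"
model   = hgf"sentence-transformers/all-MiniLM-L12-v2:model"

# tokenize and index a sentence as bert input
sample = encode(textenc, "This is an example sentence.")
```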
Here are some things I'm going to rewrite for the new release:

- `Basic.Vocabulary` with `TextEncodeBase.Vocab`.

Feel free to add comments.
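For reference, a minimal sketch of the `TextEncodeBase.Vocab` interface that would take over from `Basic.Vocabulary` (assuming the `Vocab`/`lookup` API exported by TextEncodeBase.jl):

```julia
using TextEncodeBase

# build a vocabulary with an explicit unknown token
vocab = Vocab(["[UNK]", "the", "cat", "sat"], "[UNK]")

lookup(vocab, "cat")  # word -> index
lookup(vocab, 2)      # index -> word
lookup(vocab, "dog")  # unknown word -> index of "[UNK]"
```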