FluxML / FluxML-Community-Call-Minutes

The FluxML Community Team repo

Add language models #3

Closed. darsnack closed this issue 3 years ago.

darsnack commented 4 years ago

Add standard language models.

ToucheSir commented 4 years ago

Prior art: https://github.com/chengchingwen/Transformers.jl

darsnack commented 4 years ago

Looks great to me! We could just re-export these models. Or we could ask the maintainer whether it makes sense for Transformers.jl to be a base for transformer-like models, and then keep state-of-the-art models implemented with Flux + Transformers.jl in our repo.
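For concreteness, here is a minimal sketch of what the re-export option could look like, assuming Reexport.jl is used; the FluxModels module name is hypothetical and this is illustrative rather than a committed design:

```julia
# Hypothetical module (FluxModels) that surfaces Transformers.jl models directly.
module FluxModels

using Reexport

# Re-export everything Transformers.jl itself exports, so users can write
# `using FluxModels` and still reach the Transformers.jl models.
@reexport using Transformers

end # module
```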

ToucheSir commented 4 years ago

Pinging @chengchingwen for visibility. I think it would be good to know areas where the existing Flux ecosystem is lacking for NLP workflows as well.

chengchingwen commented 4 years ago

Hi, first time here. Could someone give me a brief summary of the context?

chengchingwen commented 4 years ago

By the way, I'm working on a GSoC project that will cover some pretrained models from huggingface/transformers.

darsnack commented 4 years ago

This is an effort by a bunch of Flux users to improve the ML ecosystem for Julia. We are using this repo to track issues across many packages. The main idea is that users of other ML frameworks land on the TF or PyTorch home page, and that's their one-stop shop for everything in the framework. Julia ML doesn't operate that way (nor should it necessarily), but we want to make things smoother and more coordinated for new and experienced users.

One issue that has come up often is that we don't have standardized (pre-trained) models available. So we are creating FluxModels.jl to provide these models for various domains. That repo will both distribute pre-trained models to users and serve as the basis for benchmarking and regression tests, so we know when changes to Flux or Zygote reduce performance on standard datasets and models.

Some domains already have models implemented (e.g. Transformers.jl). We don't want to usurp the work done by other members of the community. So, we are trying to figure out the best way to integrate all of this together.

ToucheSir commented 4 years ago

Thanks for chiming in! We're trying to start a community effort to identify bottlenecks/gaps in the current Flux ecosystem (all tracked on this repo's project for now). If you have anything to add based on your experience with Transformers.jl or other DL code, please don't hesitate to add it here.

darsnack commented 4 years ago

Another thing I'll add is that Flux is missing a lot of critical functionality related to data loading, pre-processing, etc. We will be building interfaces and packages to provide these, and it would be nice if existing packages across the ecosystem adopted them too. That way the code stays consistent across implementations, and a new user doesn't feel like there are 10 different ways to do the same pre-processing step.

chengchingwen commented 4 years ago

> If you have anything to add based on your experience with Transformers.jl or other DL code, please don't hesitate to add it here.

I'll say that Flux lacks some operators/functions, with or without GPU support. Most of the time Transformers are handling high-dimensional tensors, but we are missing these kinds of "batched" operators. It's not pleasant for a newcomer to see an op that exists in TF/PyTorch but not in Flux. And LOTS of packages have to copy the same implementation over and over again (like batched matrix multiplication).
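To make the point concrete, here is a minimal, naive sketch of the kind of batched operator being described: a batched matrix multiplication over 3-D arrays with the batch dimension last. The function name and implementation are illustrative only (NNlib.jl has since added a `batched_mul` covering this pattern):

```julia
using LinearAlgebra: mul!

# Naive batched matrix multiplication.
# A is m×k×b and B is k×n×b, with the batch dimension last.
function batched_matmul(A::AbstractArray{T,3}, B::AbstractArray{S,3}) where {T,S}
    size(A, 3) == size(B, 3) || throw(DimensionMismatch("batch sizes differ"))
    m, k, b = size(A)
    size(B, 1) == k || throw(DimensionMismatch("inner dimensions differ"))
    n = size(B, 2)
    C = similar(A, promote_type(T, S), m, n, b)
    for i in 1:b
        # Multiply one matrix pair per batch entry, writing into a view of C.
        @views mul!(C[:, :, i], A[:, :, i], B[:, :, i])
    end
    return C
end
```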

SomTambe commented 4 years ago

@darsnack I would like to help with adding functionality related to data loading and pre-processing. So far I have only worked with PyTorch, and I faced some problems while loading 3D-vision datasets. To resolve them, I had to modify the source code of some dataset classes, which gave me a peek at PyTorch's internals. I have been using Julia for 7 months now, and I believe I could help with this task.

Also, Flux already has a DataLoader; could you elaborate on what you plan to add when you say functionality related to data loading? I understand the pre-processing requirements, but I don't understand the data-loading ones.
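For reference, the existing Flux DataLoader mentioned above handles basic mini-batching. A minimal sketch with toy random data, assuming a recent Flux (0.13 or later) where DataLoader is re-exported from MLUtils.jl:

```julia
using Flux

# Toy data: 100 samples with 10 features each, plus integer labels.
X = rand(Float32, 10, 100)
Y = rand(1:5, 100)

# Mini-batches are taken along the last dimension of each array.
loader = Flux.DataLoader((X, Y); batchsize = 16, shuffle = true)

for (x, y) in loader
    # x is 10×16 (the final batch may be smaller); y holds the matching labels.
    @assert size(x, 2) == length(y)
end
```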

Kindly update me with what you plan to do, I would be more than happy to add functionality 😀 .

darsnack commented 4 years ago

@SomTambe I pinged you on the relevant issue along with Peter Wolf, who is leading the current efforts. We'd love to have your input both here and on Zulip about the issues you ran into and what you'd expect out of such functionality.

darsnack commented 3 years ago

Closing this issue since Transformers.jl is well-maintained, and there are no current plans from the community team to work on pretrained NLP models.