Closed BrikerMan closed 5 years ago
@alexwwang @HaoyuHu
I am not familiar with tf.keras yet, so I read the documentation quickly. After that, I think we should first get clear on some questions and reach an agreement:
So, what about keeping the current edition and making a new one with tf.keras, to give users maximum freedom of choice?
I think we should agree on those two questions before any change, too.
Who would use Kashgari, and for what purpose?
- Academic users: experiments to prove their hypotheses
- NLP beginners: learning how to build an NLP project with production-level code quality
- NLP developers: building production-level classification/labeling models
What quality is this project aiming for? Production level or toy level?
- Production level is the vision. I think we should build a well-organized, well-documented framework with high performance.
So, I think migrating to tf.keras is a good idea:
- tf.keras provides the same API as keras
- tf.keras provides better performance, an easy upgrade path to tf2.0, and easy deployment of trained models on various platforms
- Better training policy support
The only thing tf.keras is not good at is switching the Keras backend to CNTK or Theano. But I think 99% of Kashgari's users don't need this feature, so I recommend maintaining one edition with tf.keras.
Mostly I agree with you, especially on your thoughts about the users and vision of this project, and the technical advantages of tf.keras.
The only thing I am not sure about is whether there are a few NLP users who use MXNet/CNTK as their main backend framework. So I suggest at least keeping this edition, which uses keras as its backbone, even if we don't maintain it in the near future or during development of the new tf.keras edition. Maybe in the future someone else will be interested in continuing to develop it for those two groups of users.
That's my opinion: never lock a door completely. I think this is good for open-source communities.
What do you think?
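The "same API" point above can be sketched with a minimal import swap (illustrative only; Kashgari's real modules are more involved than this toy model):

```python
# Before (standalone Keras, with pluggable CNTK/Theano backends):
#   from keras.models import Sequential
#   from keras.layers import Dense
# After (tf.keras, TensorFlow-only) -- only the import path changes:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# The model-building code itself is unchanged.
model = Sequential([
    Dense(4, activation='relu'),
    Dense(2, activation='softmax'),
])
print(len(model.layers))  # 2
```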
Of course, we will keep the keras version as the legacy version for those who want to use MXNet/CNTK.
Let's finish #76 and #69 and keep a legacy version (v0.1 ~ v0.3), then start working on the new version from v0.4 with tf.keras.
Ok. I've no problem.
In the future, tf.keras will be more closely integrated with TensorFlow 2. The improved computing performance and the convenient TensorFlow Serving solution are of great significance for engineering and shipping algorithm research. We only need to keep the current version and create a new project based on tf.keras. :D
I assume it would be easy to port most of the current models and layers code to tf.keras just by modifying their import declarations. Am I right?
It will be easy for most of the models. But we need to wait for keras-bert and keras-gpt-2 to support tf.keras.
Actually, I am hoping to fully rewrite and clean up our code, and change the model save function to this: https://www.tensorflow.org/alpha/guide/saved_model
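The SavedModel workflow from that guide can be sketched as below, using a plain tf.Module for brevity. Kashgari's real models are keras models rather than this toy Scaler class, but the save/load mechanism is similar, and the exported directory is what TensorFlow Serving loads:

```python
import tensorflow as tf

# A minimal serializable module (illustrative only).
class Scaler(tf.Module):
    def __init__(self, factor):
        super().__init__()
        self.factor = tf.Variable(float(factor))

    # The input_signature lets SavedModel trace and export this function.
    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return x * self.factor

module = Scaler(2.0)
tf.saved_model.save(module, '/tmp/scaler_saved_model')

# Reload; the restored object exposes the traced function.
restored = tf.saved_model.load('/tmp/scaler_saved_model')
print(restored(tf.constant([1.0, 3.0])).numpy())  # [2. 6.]
```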
Ok, I will check it later.
https://github.com/HighCWu/keras-bert-tpu This project is a fork of CyberZHG/keras_bert which supports Keras BERT on TPU. I have tried it and found it's not perfectly compatible with ours.
@HaoyuHu I have communicated with CyberZHG. He is planning to support tf.keras, so I think we just need to focus on how to rewrite and clean up our code. Here is the issue: https://github.com/CyberZHG/keras-ordered-neurons/issues/1
Interesting. That is fine if he is going to support tf.keras. BTW: instead of keras.utils.multi_gpu_model(), can we consider using distribution strategies (such as MirroredStrategy), which are recommended in TF 2.0, to support multi-GPU? https://github.com/tensorflow/tensorflow/issues/26245#issuecomment-474611589
@HaoyuHu Yes, distribution strategies are a more elegant way to support multi-GPU/multi-device/TPU training. We definitely should do that.
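A minimal sketch of that approach, assuming the TF 2.x API; the tiny model is illustrative only, and MirroredStrategy falls back to a single replica when no extra GPUs are present, so the same code runs on any machine:

```python
import numpy as np
import tensorflow as tf

# Replacement for keras.utils.multi_gpu_model(): mirror variables across
# all visible GPUs (or just the one local device, if that's all there is).
strategy = tf.distribute.MirroredStrategy()
print('replicas in sync:', strategy.num_replicas_in_sync)

with strategy.scope():
    # Variables created inside the scope are replicated on every device.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(4, activation='relu'),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')

# fit() is unchanged; gradients are aggregated across replicas.
model.fit(np.random.rand(32, 8), np.random.rand(32, 1), epochs=1, verbose=0)
```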
I have tried tf.keras with a classifier model: training took 9s per epoch on tf2.0 while tf1.13 needed 68s. Prediction is 10x faster with tf2.0, too.
Exciting! For my use case, about 8 times faster.
Seems awesome!
Added a new branch tf.keras-version for the tf.keras version. Planning to start from the beginning.
Good to go!
Roll on.
@BrikerMan Do you have any ideas on structural optimization for the new edition? Or should we just focus on improving the save/load process and some other refactors?
I think the training process for professional users should look like this:
x, y = Corpus.load_data()
embedding = XXEmbedding()
# Save the numerized input/output data (without padding) and the dict to cache_path.
embedding.build_on_corpus(x, y, cache_path='./cached_data')
classifier = CNNClassifier(embedding)
classifier.build_model('./cached_data')
# Access the underlying keras model directly if needed.
keras_model = classifier.keras_model
classifier.fit()
And for beginners:
x, y = Corpus.load_data()
embedding = XXEmbedding()
classifier = CNNClassifier(embedding)
classifier.fit(x, y)
Added documentation to readthedocs.io; here is the link: https://kashgari.readthedocs.io/en/tf.keras-version
I have added a pre-processor class, documentation, and test cases. Feel free to check it out and tell me if anything needs changing. Let's make Kashgari better.
Two thoughts so far:
- For benchmarking models, since we've decided to build Kashgari on tf.keras, maybe we could take advantage of TensorFlow Serving for model benchmarking in the preprocessor class, or make it a helper/utility. How does that sound?
- Regarding sequence length, my opinion is that it would be safer to set the default to the 90th percentile of the raw sequence lengths. As long as we throw a warning when the default is used and document it, users will be aware of this rule. What do you think?
Yeah, I agree with both ideas. As for leaving sequence length as None, maybe we could make it an option for pro users.
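The 90th-percentile sequence-length default suggested above could be computed roughly like this (a sketch with a hypothetical helper name, not Kashgari's actual API):

```python
import numpy as np

def suggest_sequence_length(corpus, percentile=90):
    """Pick a padding length covering the given percentile of sequence lengths,
    so only the longest outliers get truncated instead of padding everything
    to the maximum length."""
    lengths = [len(seq) for seq in corpus]
    return int(np.percentile(lengths, percentile))

# Toy corpus: nine short sequences plus one extreme outlier of length 95.
corpus = [['token'] * n for n in (5, 7, 8, 9, 10, 12, 15, 18, 20, 95)]
print(suggest_sequence_length(corpus))  # 27 -- far below the max of 95
```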
One more thing: I think we should consider adding text generation as our third task, supporting classification, labeling, and generation. Any suggestions about this idea? @alexwwang @HaoyuHu
Just added a demo labeling model, check it out: https://github.com/BrikerMan/Kashgari/blob/8c560551a9bc9e8841138adac80c59c53e73ad7a/kashgari/tasks/labeling/models.py#L122
Remember to check the document too. Link for demo document
I'm not familiar with the text-generation task; what's your opinion, @alexwwang?
Unfortunately, me neither. AFAIK, text generation relates to GANs and uses quite different models from classification or labeling. I should refer to articles on GANs and related corpora before diving into this, as I know few details about them so far.
@alexwwang I have added multi-input support for all the embedding layers in the tf.keras version. Users can use the multi-input function without changing any code. I'm also going to add a multi-output example so that users can easily change the model to support their data.
@BrikerMan Great to hear that! Did you implement it by modifying the data generator method? I can't wait to see how you did it.
I have added a new class called Processor. Each model has its own embedding, and each embedding contains a processor. We can replace the processor without changing any code.
I see, decoupling. Wait for your push!
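The Processor/embedding decoupling described above might look roughly like this (all class and method names here are hypothetical, a sketch rather than Kashgari's actual implementation):

```python
class Processor:
    """Turns raw token sequences into integer sequences."""
    def __init__(self):
        self.vocab = {'<pad>': 0, '<unk>': 1}

    def fit(self, corpus):
        # Assign an index to each previously unseen token.
        for sentence in corpus:
            for token in sentence:
                self.vocab.setdefault(token, len(self.vocab))

    def numerize(self, sentence):
        return [self.vocab.get(token, 1) for token in sentence]


class Embedding:
    def __init__(self, processor=None):
        # The default processor can be swapped out (e.g. for multi-input
        # data) without touching the embedding or model code.
        self.processor = processor or Processor()


class Classifier:
    def __init__(self, embedding):
        self.embedding = embedding

    def preprocess(self, corpus):
        self.embedding.processor.fit(corpus)
        return [self.embedding.processor.numerize(s) for s in corpus]


corpus = [['hello', 'world'], ['hello', 'kashgari']]
clf = Classifier(Embedding())
print(clf.preprocess(corpus))  # [[2, 3], [2, 4]]
```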
@alexwwang Pushed, check it out.
Now we can finalize the tf.keras version's release.
Todo list:
Which parts do you guys want to work on? @alexwwang @HaoyuHu
One more thing: should we simplify our basic models by decreasing the neuron count?
I'll try my best to do some parts I'm familiar with. 😄
I think we could learn from the keras-bert project and build a middle layer to hold all the backend dependencies.
How about adding classification models, tests, and docs? Here is the labeling tutorial doc: https://kashgari.readthedocs.io/en/tf.keras-version/tutorial/sequence_labeling_model.html
I could dive in from here.
@BrikerMan Sorry for the late reply. I'm interested in the advanced use cases. At the same time, I can participate in some other documentation work.
@HaoyuHu Here are several advanced use-case examples; maybe you could start from there. Feel free to update the documents about these use cases.
@alexwwang @HaoyuHu I am thinking about renaming all the models for readability. Any ideas?
Here are several styles I found. For example, renaming BLSTMCRFModel:
- Bi_LSTM_CRF_Model
- BiLSTM_CRF_Model
- Bi_LSTM_CRF # I prefer this one
- BiLSTM_CRF
@HaoyuHu Here are several advanced use-case examples; maybe you could start from there. Feel free to update the documents about these use cases.
I will delve into these. :smile:
I prefer the last two.
I think the second and the last are the most concise. The question is whether the _Model suffix is necessary.
I think you are right, let's keep the Model suffix and use the second one: camel case, plus underscores between the uppercase names.
Camel case for the name of each main structural unit in the model, underscores to join all the units a model uses, and a final Model to indicate it's a model, since a similar naming rule is used for the layer part but without the _Model suffix.
That's my thought. Any better idea? @BrikerMan @HaoyuHu
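As one sample of how the convention could work in practice, the old name might be kept as a deprecated alias during the transition (a hypothetical sketch, not the agreed release plan):

```python
import warnings

# New name: camel case per unit, underscores joining units, _Model suffix.
class BiLSTM_CRF_Model:
    """Bi-directional LSTM encoder with a CRF decoding layer."""
    pass

# Old name kept as a deprecated alias so existing user code keeps working.
class BLSTMCRFModel(BiLSTM_CRF_Model):
    def __init__(self, *args, **kwargs):
        warnings.warn('BLSTMCRFModel has been renamed to BiLSTM_CRF_Model',
                      DeprecationWarning)
        super().__init__(*args, **kwargs)

m = BLSTMCRFModel()  # still works, but warns about the rename
assert isinstance(m, BiLSTM_CRF_Model)
```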
Could you give several samples?
I am proposing to change keras to tf.keras for better performance, better serving, and TPU support. Maybe we should rewrite the whole project, clean up the code, add missing documentation, and so on.
Here are the features I am planning to add.