I have some code for music generation and a dataset that I can share as an example (currently not working due to the problems with masking, but this is already on the to-do list). It's similar to char-rnn but predicts musical "tokens" instead of characters.
@jfsantos sounds great. It would be neat to turn this into a reusable app (e.g. provide a folder with enough MIDI files or audio files in a certain style, and start generating MIDI tracks or audio files in that style). What is the "token" space you were using?
The token space I used is ABC notation symbols. They are mostly used for representing music for a single instrument (mostly monophonic, even though there's a notation for chords). I don't know if there are a lot of datasets in this format, but there's the one I used (which contains ~25k tunes).
The code could probably be converted to use MIDI or another format instead of ABC. For other formats, we would need a parser. I considered using the parsers from music21 but that would add an external dependency to the example.
MIDI would certainly be a better format to allow a wide range of people to play around with it. It's a good starting point. I think the killer app would involve learning from audio files and generating audio files, with some "clean" data representation in between (possibly derived from ABC). Previous attempts have been doing it completely wrong, but we could do it right.
Regarding masking, I'm trying to implement a feed-forward network using `Graph`, like the following: Embedding -> Flatten -> Dense -> ... I'm padding my short sequences with 0 in both inputs and outputs. If I set `mask_zero=True` for the embedding layer, the Flatten and Dense layers break, as they are not supposed to be used with masks. Changing `keras/layers/core.py` so that they derive from `MaskedLayer` instead of `Layer` makes the system at least train, but I'm not sure whether the inner parts are playing nicely with the masks. I assume this wouldn't be so simple to fix this way :)
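For context, here is a minimal sketch of the architecture in question, written with the Sequential API and placeholder sizes (my own illustration, not code from this thread); depending on the Keras version, building it either raises an error at the Flatten layer or silently drops the mask, which is the breakage described above:

```python
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense

vocab_size, maxlen = 1000, 20   # placeholder sizes

model = Sequential()
# mask_zero=True makes the Embedding layer emit a mask for the padded
# zeros, but Flatten and Dense do not know how to consume that mask.
model.add(Embedding(vocab_size, 64, input_length=maxlen, mask_zero=True))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.compile(optimizer='adam', loss='mse')
```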
Could you recommend some paper/video/book/code example links to study more about it, please?
We need a `K.tensordot` which mimics Theano's `batched_tensordot` but also works on TensorFlow. Memory networks are impossible without dot merge.
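For reference, a small NumPy sketch of what a batched contraction computes in the simplest (batched matrix product) case; this is only an illustration of the semantics, not a proposed backend implementation:

```python
import numpy as np

# For each sample b, contract the last axis of x[b] with the first axis of y[b].
batch, n, k, m = 32, 4, 5, 6
x = np.random.rand(batch, n, k)
y = np.random.rand(batch, k, m)

out = np.einsum('bnk,bkm->bnm', x, y)   # shape (batch, n, m)

# Equivalent explicit loop over the batch dimension:
ref = np.stack([x[b].dot(y[b]) for b in range(batch)])
assert np.allclose(out, ref)
```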
That's true, but I think we can wait for TensorFlow to implement tensor contraction. Rolling our own implementation would be inefficient.
> Could you recommend some paper/video/book/code example links to study more about it, please?
Study what, Keras? Here's a pretty good video intro: https://www.youtube.com/watch?v=Tp3SaRbql4k
Adding new apps is definitely a great step. What I would recommend is to start by making the current examples interactive. For example, after training `babi_memnn`, the user should be able to input a story and a question (as natural language text, not word indices) and ask questions about it to the model. Instead of each example being a single Python file, each should be a folder with sub-folders `train_data` and `test_data` and separate scripts `train.py` and `test.py`. This will give absolute control to the user, at the cost of a `save_weights`/`load_weights` round-trip (`train.py` saves and `test.py` loads the HDF5 file), as sketched below. Also, there should be explicit examples for visualization.
I am really happy to hear about these things. I think that for researchers, models with state-of-the-art performance are what's needed. If you want to make the examples interactive, I suggest it would be better to give users a GUI version. In my opinion, the people who use Keras mostly do so for research or business, rather than for fun. Anyway, as for me, I'm just a beginner who has been doing deep learning for one year (almost the same age as Keras?), and it is time for me to publish papers; I think many people are in the same situation. I hope Keras will add some baseline models proposed in research papers, and I will put in as much effort as I can.
I agree. I think people use Keras mostly for serious stuff (research/business) rather than for fun. I would expect Keras to support more state-of-the-art models rather than making the examples interactive.
Cool, just great, but too short. Could you share more links like this, please?
@Sandy4321 That video covers pretty much all the basics. Also check out the documentation and examples. If you need help with any specific problem, consider opening a new issue.
Update on our progress so far:
What blogging platform would you guys suggest for the Keras blog? Requirements:
Maybe we'll end up falling back to Github for content management + S3 for hosting + a custom static site generator. Wouldn't be the first time for me.
Also, what hosting platform would you guys suggest for the (500+MB) weight files of a Keras model Zoo? Hosting on my personal S3 account (as I do for Keras datasets) would be prohibitively expensive.
I mean, how many weight files are we expecting? A quick check on the AWS calculator shows that 10GB will run ~64 cents/mo.
@lukedeo hosting would be inexpensive. It's downloads that are the problem. Keras has around 30k active users, so we could realistically expect several TBs of downloads every month, which would potentially cost hundreds every month.
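(For a rough sense of scale, assuming S3 egress pricing of about $0.09/GB, which is my own assumption rather than a figure from this thread: 3 TB of monthly downloads is roughly 3000 GB × $0.09 ≈ $270, so several TB per month does land in the hundreds of dollars.)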
Yikes, I didn't realize Keras was at 30k! I remember reading that Rackspace doesn't charge based on bandwidth... might be an option.
@fchollet I'm going to test my music generation models this week. It's still based on a textual representation of music but it's a start.
Regarding blogging platforms, I recommend Pelican, a static site generator written in Python and aimed at blogs. There are plenty of templates to choose from and it's fairly easy to write your own. It also has a plugin interface for generating additional pages (e.g. I have one that generates a list of publications from a BibTeX file). We could host it on GitHub Pages (that's what I do for my website). Here's one of my blog posts using LaTeX rendering and code snippets.
What about just using GitHub Pages for the blog? It can be written in Markdown and version-controlled with git. Jekyll could be the tool to generate it.
About the QA system, I'd like to implement it with the seq2seq model in this paper. But it seems difficult to implement in Keras, since it's not easy to copy the encoder RNN's hidden state into the decoder. Maybe I can try to train the model from `examples/addition_rnn.py` on some movie subtitles and see the results.
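For reference, the way `examples/addition_rnn.py` sidesteps copying hidden state is to re-feed the encoder's final output at every decoder timestep via `RepeatVector`. A minimal sketch of that pattern (my own, with placeholder sizes, assuming a recent Keras with the `TimeDistributed` wrapper):

```python
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

vocab_size = 1000             # placeholder vocabulary size (one-hot input)
input_len, output_len = 20, 20

model = Sequential()
# Encoder: read the input sequence and compress it into a single vector.
model.add(LSTM(128, input_shape=(input_len, vocab_size)))
# Feed that vector to the decoder at every output timestep instead of
# copying the encoder's hidden state into the decoder.
model.add(RepeatVector(output_len))
# Decoder: emit one token distribution per output timestep.
model.add(LSTM(128, return_sequences=True))
model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))
model.compile(optimizer='adam', loss='categorical_crossentropy')
```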
@farizrahman4u Thanks. I've found this project before; it is awesome. But it has some custom layers, and I don't know if it is a good idea to use that as an example. I think it's better to just stack some existing layers in an example. Maybe merging your layers into upstream Keras would be a good idea?
@wb14123 As you said, custom layers. They are kind of hackish and do not work with TensorFlow, so I don't think they meet the Keras standards, hence the separate repo.
@jfsantos thanks for the suggestions. Pelican + Github Pages sounds good, we'll probably do that.
Suggestions to increase appeal to industry:
- Integration with Blaze for learning across many backends (databases, out-of-core dataframes, etc.)
- Time series prediction
Hey, I've been using Keras for a couple of weeks now and I'd like to contribute in some way! I'd love to take on some sort of NLP-related example task. Also, this'd be my first open source project.
@Anmol6 Try adding multiple hops to the memory network example as mentioned in the paper. Should be a nice start.
@farizrahman4u which paper? and you mean this example: https://github.com/fchollet/keras/blob/master/examples/babi_memnn.py?
Yes, that one. But as you can see, there is only one memory hop, so it will work only for bAbI task 1. But if you do multiple hops (at least 3), you can do this:
You can get Theano code from https://github.com/npow/MemN2N
I see, I'll try that out. Thanks!
Hey, so I'm working on getting the multiple hops done. I'm having trouble figuring out how the code at https://github.com/fchollet/keras/blob/master/examples/babi_memnn.py is employing this step outlined in the paper (if at all):
If that's not being used, could you explain the logic behind the model in the code? Thanks!
It's actually easier than you think. In memory hop 1, the output is a function of the question and the story. This is already done in the Keras example. In memory hop 2, the output is a function of the question, the story and the output of hop 1.
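A rough, untested sketch of how that hop recursion could be wired up (my own illustration using a recent functional API; the single shared embeddings and the sizes are simplifications of the paper's per-hop A/C matrices):

```python
from keras.models import Model
from keras.layers import Input, Embedding, Lambda, Dense, Activation, dot, add
import keras.backend as K

vocab_size, story_maxlen, query_maxlen, embed_dim = 50, 20, 6, 64  # placeholders

story_in = Input(shape=(story_maxlen,))
query_in = Input(shape=(query_maxlen,))

embed_a = Embedding(vocab_size, embed_dim)   # memory "input" embedding
embed_c = Embedding(vocab_size, embed_dim)   # memory "output" embedding
embed_b = Embedding(vocab_size, embed_dim)   # question embedding

# Encode the question as a single vector by summing its word embeddings.
u = Lambda(lambda x: K.sum(x, axis=1))(embed_b(query_in))

m = embed_a(story_in)   # (batch, story_maxlen, embed_dim)
c = embed_c(story_in)   # (batch, story_maxlen, embed_dim)

for _ in range(3):                        # three memory hops
    # Attention over memory positions: p = softmax(m . u)
    p = Activation('softmax')(dot([m, u], axes=(2, 1)))   # (batch, story_maxlen)
    # Weighted sum of the output memories: o = sum_i p_i * c_i
    o = dot([p, c], axes=(1, 1))                          # (batch, embed_dim)
    # The next hop's query is a function of the previous query and o.
    u = add([o, u])

answer = Dense(vocab_size, activation='softmax')(u)
model = Model(inputs=[story_in, query_in], outputs=answer)
model.compile(optimizer='adam', loss='categorical_crossentropy')
```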
@farizrahman4u Maybe this should move into a more specific issue, but I was also confused about the bAbI example; it's not really obvious to me that it implements memory networks.
The `match` seems to correspond to the pre-softmax p vector, but I don't think there's any weighted sum going on, except if I'm confused by the embedding of memories into query_maxlen-dimensional space, which I didn't really understand.
The way I'd reproduce the MemN2N construction in the current framework would be to add a softmax activation to `match`, embed `input_encoder_c` to 64d, and compute the match-weighted sum of `input_encoder_c` elements by either A. `RepeatVector(64)`-ing the `match` to be able to dot-product, or B. dot-producting the `match` and `input_encoder_c`. There shouldn't be any place where an LSTM enters at this point, as the shape at that point is just (batch, 64)? Does that make sense? If the current construction is somehow equivalent to that, sorry for the noise; it's lost on me, though.
However, this wouldn't really reproduce MemN2N anyway since it treats memories at a word level, picking the relevant words rather than relevant sentences, which is the story-to-memory segmentation the memory networks use. For that, we'd have to bump the dimensionality of the input and put each memory in a separate 2d tensor, then either use averaging or RNNs to get memory embeddings (which might be possible with the very latest git I guess?).
(P.S.: I work on a bunch of related Keras models that model sentence similarities (at the core that's what MemNNs do too), e.g. https://github.com/brmson/dataset-sts/blob/master/examples/anssel_kst1503.py but I already have some way more complicated ones (e.g. almost reproducing 1511.04108) in my notebooks that I hope to tweak and publish soon - once my deadlines pass during February, I'll be happy to clean up and contribute them to Keras as examples.)
@pasky Though I am very late to this thread, I completely agree that the current babi_memnn.py implementation does not treat memory at the sentence level. I am trying to implement end-to-end memory networks and would appreciate it if you could share the code you have written for the same.
Last month, we delivered on our key development goals for the period. Keras has made great strides in code quality and documentation.
Here's an update with our new goals. On the one hand, we will continue improving the codebase and feature set of Keras. On the other, we will start focusing more on providing the community with a wealth of real applications, rather than just library features. As deep learning engineering becomes increasingly commoditized (most notably by Keras), Keras needs to move up the ladder of abstraction and start providing value at the application level in order to stay relevant for the next 5 years.
These applications will roughly fall into two categories:
Development:
- `TimeDistributed`
- `Highway`
- `Residual` (residual learning)

Applications:
As a closing note, I am noticing that the October-December period, rich in ML conferences, has seen the release of over 15 research papers using Keras for their experiments (plus an unknowable number of papers that used Keras without citing it -- a majority of papers never cite the open source frameworks they use). This is a positive sign :)