[Open] cwellsarnold opened this issue 8 years ago
Hi, have you looked into this example with `--bidirectional`?
Thanks @nicholas-leonard! I have looked at the example and think I could adapt it to my targets. However, I'm still not clear on how I would set it up with multiple documents. For example, I do not want the context at the end of a previous document to impact the prediction of a label at the beginning of the next document. Does this make sense? Could `SentenceSet` be adapted for this?
@cwellsarnold You could use `SentenceSet` for that. Or you could use `TextSet` but shuffle all the sentences beforehand, in which case your model would implicitly learn to forget previous states after a sentence ends.
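The shuffle-beforehand approach might look something like this (a sketch, untested; it assumes sentences are kept as a table of 1D `IntTensor`s of word indices — `sentences` is a hypothetical name, not something from dp):

```lua
-- Sketch (untested): shuffle sentences before concatenating them into the
-- single continuous stream that TextSet expects.
-- `sentences`: hypothetical table of 1D torch.IntTensors of word indices.
local perm = torch.randperm(#sentences)  -- random permutation of sentence order
local shuffled = {}
for i = 1, #sentences do
   shuffled[i] = sentences[perm[i]]
end
local stream = torch.cat(shuffled, 1)    -- one continuous stream of indices
```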
@nicholas-leonard I'd like to preserve context between sentences in the same document (useful for co-reference resolution), but not across documents. Thus, my "sentences" would be entire documents. Would this pose a problem, as they would be much longer than normal sentences? I'm a bit confused, but it also looks like batches are formed by finding sentences of the same length, which is why I attempted to create my own examples based on my defined context window (`rho`).
Ok, so yeah, your sentences are entire documents. You could use `SentenceSet`, but I would still recommend using `TextSet`. You don't need to train your RNN/LSTM with a `rho` equal to the size of your document/sentence. Just use a smaller fixed-size `rho` (100 or so). During evaluation, you can evaluate with an infinite `rho`, where you just continuously loop through the entire corpus without ever forgetting. Even with `SentenceSet`, you wouldn't necessarily use a `rho` equal to your sentence size. In any case, `TextSet` is much easier to use.
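The fixed-rho-for-training / infinite-rho-for-evaluation idea could be sketched with the rnn package roughly like this (untested; `inputSize` and `hiddenSize` are hypothetical placeholders, not anything from this thread):

```lua
-- Sketch (untested): train with a fixed rho, evaluate with unbounded memory.
require 'rnn'

local rho = 100  -- fixed number of steps to backpropagate through (BPTT)
local lstm = nn.LSTM(inputSize, hiddenSize, rho)  -- inputSize/hiddenSize assumed
local model = nn.Sequencer(lstm)

-- Training: feed tables of rho time-steps at a time, e.g.
--   model:forward({x1, x2, ..., x_rho})

-- Evaluation: keep the hidden state across forward calls, so the effective
-- rho is unbounded while you stream through the whole corpus:
model:remember('both')
-- ... loop over the corpus ...
model:forget()  -- reset the hidden state manually when done
```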
@nicholas-leonard Pardon my delayed reply; I didn't find much free time over the holidays. Regarding your suggestion, it appears that `TextSet` assumes a single continuous stream of text. In this scenario, wouldn't I need to append different documents together, and thus introduce false examples at document intersections, since my documents are exchangeable (e.g., the end of a document has no impact on the beginning of the next document, and vice versa)? Or is there a way that I could create many `TextSet`s (one for each document) and use them in a more generic `DataSource`? Thanks!
@cwellsarnold You would indeed need to concatenate all documents. You could separate them with an `<end document>` word. It would indeed introduce false examples, but only at document intersections (which are few compared to all the words), and the model (i.e. LSTM/GRU) could implicitly learn to forget the past hidden state when an `<end document>` word is encountered.
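The concatenation step might look something like this (a sketch, untested; `docs` and `endDocIndex` are hypothetical names for the per-document index tensors and the vocabulary index of the separator word):

```lua
-- Sketch (untested): concatenate document tensors with an <end document>
-- index between them, producing the single stream that TextSet expects.
-- `docs`: hypothetical table of 1D torch.IntTensors of word indices.
-- `endDocIndex`: hypothetical vocabulary index of the <end document> word.
local pieces = {}
for _, doc in ipairs(docs) do
   table.insert(pieces, doc)
   table.insert(pieces, torch.IntTensor{endDocIndex})  -- document separator
end
local corpus = torch.cat(pieces, 1)  -- one continuous stream of word indices
```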
There is no way to concatenate `TextSet`s into a `DataSource`. However, if you just use your `TextSet`s without the rest of dp, not encapsulating them in a `DataSource`, then you could manually call `forget()` between `TextSet` processings.
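The manual-`forget()` loop could be sketched like this (untested; `textSets`, `model`, and `iterate` are hypothetical placeholders standing in for your per-document `TextSet`s, your network, and whatever iteration you use over each set):

```lua
-- Sketch (untested): process per-document TextSets, resetting the hidden
-- state between documents so no context leaks across document boundaries.
model:remember('both')  -- keep state within a document
for _, textSet in ipairs(textSets) do  -- textSets: hypothetical table
   for input, target in iterate(textSet) do  -- iterate: hypothetical iterator
      local output = model:forward(input)
      -- ... criterion, backward, parameter update ...
   end
   model:forget()  -- drop hidden state so documents stay independent
end
```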
Hi all, I am trying to do sequence tagging (i.e., a label for each token in a sequence) using the `BiSequencer` model with LSTMs, and have been running into some trouble trying to determine how to format my input data and targets. I have built a `dp.DataSource` with `dp.DataSet`s. Each `dp.DataSet` contains two `dp.ClassView`s using `torch.IntTensor`s. The shape of my input and target tensors is `num_samples` x `num_timesteps`, where `num_timesteps` is the context size (i.e., the number of words to use before and after the current word being predicted). Each element in the tensor is the index of a word in my collection (or the word's label for the target vector), which I have computed using my own preprocessing tool. I am using a `LookupTable` and a `SplitTable` in my model to convert these indices to embeddings.

When I run my data through the network, my output is a table with `num_timesteps` rows, with each row containing a `dp.DoubleTensor` of size `batch_size` x `num_classes` (this, by the way, causes a problem with the `Confusion` feedback object, as it does not expect a table). However, the target tensor is of size `batch_size`. I would have expected it to be `num_timesteps` x `batch_size`?

So, in short, how should I format my data in this case? Thanks for any help!
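For reference, the pipeline described above might look roughly like this (a sketch only, untested; the sizes, the merge of forward/backward outputs, and the final layers are assumptions, not the actual code from this issue):

```lua
-- Sketch (untested) of the described pipeline:
-- word indices -> embeddings -> table of time-steps -> BiSequencer.
require 'rnn'

-- All sizes below are assumed placeholders:
local vocabSize, embSize, hiddenSize, numClasses = 10000, 100, 200, 10

local model = nn.Sequential()
model:add(nn.LookupTable(vocabSize, embSize))  -- batch x steps -> batch x steps x embSize
model:add(nn.SplitTable(1, 2))                 -- table of steps, each batch x embSize
model:add(nn.BiSequencer(
   nn.LSTM(embSize, hiddenSize),               -- forward LSTM
   nn.LSTM(embSize, hiddenSize)))              -- backward LSTM (outputs joined)
model:add(nn.Sequencer(nn.Linear(2 * hiddenSize, numClasses)))
model:add(nn.Sequencer(nn.LogSoftMax()))

-- The output is a table of num_timesteps entries, each batch_size x numClasses,
-- which is why per-step targets (e.g. a table of num_timesteps tensors of
-- size batch_size, paired with nn.SequencerCriterion) are what the model expects.
```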