Closed rgtjf closed 1 year ago
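1. Record the features, including the original sentence and some related information, in CSV format.
2. Use the composite pattern so that the names of all Features in use can be collected. See how lasagne collects all parameters and all names.
How can the features be recorded automatically?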
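One possible way to record features automatically is to write one CSV row per sentence, with the raw sentence first and one column per feature. This is only a sketch: `write_feature_rows`, the `sentence` column name, and the sample data are assumptions, not part of the existing code.

```python
import csv
import io


def write_feature_rows(fp, records, feature_names):
    """Write one CSV row per sentence: the raw sentence, then each feature value.

    `records` is an iterable of (sentence, values) pairs, where `values`
    is ordered the same way as `feature_names`. (Hypothetical helper.)
    """
    writer = csv.DictWriter(fp, fieldnames=["sentence"] + list(feature_names))
    writer.writeheader()
    for sentence, values in records:
        row = {"sentence": sentence}
        row.update(zip(feature_names, values))
        writer.writerow(row)


# Hypothetical data: one sentence with two feature values.
buf = io.StringIO()
write_feature_rows(buf, [("a b c", [3, 2])], ["unigram", "bigram"])
csv_text = buf.getvalue()
```

Writing to a real file works the same way; open it with `newline=''` as the `csv` module recommends.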
Ref:
from collections import deque

def get_all_layers(layer, treat_as_input=None):
    """
    This function gathers all layers below one or more given :class:`Layer`
    instances, including the given layer(s). Its main use is to collect all
    layers of a network just given the output layer(s). The layers are
    guaranteed to be returned in a topological order: a layer in the result
    list is always preceded by all layers its input depends on.

    Parameters
    ----------
    layer : Layer or list
        the :class:`Layer` instance for which to gather all layers feeding
        into it, or a list of :class:`Layer` instances.
    treat_as_input : None or iterable
        an iterable of :class:`Layer` instances to treat as input layers
        with no layers feeding into them. They will show up in the result
        list, but their incoming layers will not be collected (unless they
        are required for other layers as well).

    Returns
    -------
    list
        a list of :class:`Layer` instances feeding into the given
        instance(s) either directly or indirectly, and the given
        instance(s) themselves, in topological order.

    Examples
    --------
    >>> from lasagne.layers import InputLayer, DenseLayer
    >>> l_in = InputLayer((100, 20))
    >>> l1 = DenseLayer(l_in, num_units=50)
    >>> get_all_layers(l1) == [l_in, l1]
    True
    >>> l2 = DenseLayer(l_in, num_units=10)
    >>> get_all_layers([l2, l1]) == [l_in, l2, l1]
    True
    >>> get_all_layers([l1, l2]) == [l_in, l1, l2]
    True
    >>> l3 = DenseLayer(l2, num_units=20)
    >>> get_all_layers(l3) == [l_in, l2, l3]
    True
    >>> get_all_layers(l3, treat_as_input=[l2]) == [l2, l3]
    True
    """
    # We perform a depth-first search. We add a layer to the result list only
    # after adding all its incoming layers (if any) or when detecting a cycle.
    # We use a LIFO stack to avoid ever running into recursion depth limits.
    try:
        queue = deque(layer)
    except TypeError:
        queue = deque([layer])
    seen = set()
    done = set()
    result = []

    # If treat_as_input is given, we pretend we've already collected all their
    # incoming layers.
    if treat_as_input is not None:
        seen.update(treat_as_input)

    while queue:
        # Peek at the leftmost node in the queue.
        layer = queue[0]
        if layer is None:
            # Some node had an input_layer set to `None`. Just ignore it.
            queue.popleft()
        elif layer not in seen:
            # We haven't seen this node yet: Mark it and queue all incomings
            # to be processed first. If there are no incomings, the node will
            # be appended to the result list in the next iteration.
            seen.add(layer)
            if hasattr(layer, 'input_layers'):
                queue.extendleft(reversed(layer.input_layers))
            elif hasattr(layer, 'input_layer'):
                queue.appendleft(layer.input_layer)
        else:
            # We've been here before: Either we've finished all its incomings,
            # or we've detected a cycle. In both cases, we remove the layer
            # from the queue and append it to the result list.
            queue.popleft()
            if layer not in done:
                result.append(layer)
                done.add(layer)

    return result
Ref:
def get_all_params(layer, **tags):
    layers = get_all_layers(layer)
    params = sum([l.get_params(**tags) for l in layers], [])
    return utils.unique(params)
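The same traversal idea could be applied to features: walk a composite tree and collect the names of all leaf features. The sketch below is an assumption about how that might look; the `Feature` class with a `children` list and the `get_all_feature_names` helper are hypothetical and not taken from lasagne.

```python
class Feature:
    """Composite node: a leaf only has a name, a group also has children."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add(self, child):
        self.children.append(child)
        return self  # returning self allows chained add() calls


def get_all_feature_names(root):
    """Collect the names of all leaf features below `root`, depth-first.

    Each node is visited once (tracked by identity), so a feature shared
    by two groups is listed only once. An explicit stack avoids recursion.
    """
    names, seen, stack = [], set(), [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        if node.children:
            # Reverse so that children are visited in insertion order.
            stack.extend(reversed(node.children))
        else:
            names.append(node.name)
    return names


root = Feature("all")
ngrams = Feature("ngrams").add(Feature("unigram")).add(Feature("bigram"))
root.add(ngrams).add(Feature("trigram"))
names = get_all_feature_names(root)
```

Only leaf names are collected here; whether group names should also be reported is a design choice the issue leaves open.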
Lazy operation
feature = Feature('name')
feature.add(UniGramFeature('unigram'))
feature.add(BiGramFeature('bigram', load=True))
feature.add(TriGramFeature('trigram'))
feature.input = x, y
feature.extract()
MergeFeature([namelist], name)
for name in feature.feature_names:
    print(name)
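A minimal sketch of the lazy behaviour, under the assumption that `add()` only registers a feature and nothing is computed until `extract()` is called. `NGramFeature` and `FeatureGroup` are hypothetical stand-ins for the classes named above, and the input is simplified to a list of sentences.

```python
class NGramFeature:
    """Hypothetical leaf feature: number of n-grams over whitespace tokens."""

    def __init__(self, name, n):
        self.name = name
        self.n = n

    def extract(self, sentence):
        tokens = sentence.split()
        return max(len(tokens) - self.n + 1, 0)


class FeatureGroup:
    """Hypothetical container that defers all computation to extract()."""

    def __init__(self, name):
        self.name = name
        self.features = []
        self.input = None
        self._cache = None  # filled on the first extract() call

    def add(self, feature):
        # Registration only: nothing is computed here.
        self.features.append(feature)
        self._cache = None

    @property
    def feature_names(self):
        return [f.name for f in self.features]

    def extract(self):
        # Compute once, then reuse the cached matrix on later calls.
        if self._cache is None:
            self._cache = [
                [f.extract(sentence) for f in self.features]
                for sentence in self.input
            ]
        return self._cache


group = FeatureGroup("name")
group.add(NGramFeature("unigram", 1))
group.add(NGramFeature("bigram", 2))
group.input = ["a b c", "d"]
matrix = group.extract()
```

In this sketch the cache is reset only by `add()`; assigning a new `input` after extraction would need its own invalidation.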
Combining Feature with a model class
model = Model('name', Classify)
# model.train()
# model.test()
model.add(UniGramFeature('unigram'))
model.add(BiGramFeature('bigram', load=True))
model.add(TriGramFeature('trigram'))
emb_model = Model('name', Classify)
emb_model.add(Features)
emb_model.add(Features)
model.add(emb_model)
model.train()
model.test()
How should the dict be created?
class dict_loader: supports the class-level singleton pattern and the function-level singleton pattern
@singleton  # assumes a `singleton` class decorator is defined elsewhere
class dict_loader(object):
    def __init__(self):
        self.stopwords = None

    def load_stopwords(self):
        """Load stopwords from file on first use, then reuse the cached list."""
        if self.stopwords is None:
            with open('english.stopwords.txt', 'r') as fp:
                self.stopwords = [line.strip('\r\n ') for line in fp]
        return self.stopwords

# dict_loader().load_puntcs()
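The `singleton` decorator is not defined in this issue. One common way to write it, shown here only as a sketch, is a class decorator that caches the first instance; for the function-level case, `functools.lru_cache` from the standard library gives the same load-once behaviour. `DictLoader`, `load_puncts`, and the inline sample data are hypothetical.

```python
import functools


def singleton(cls):
    """Class decorator: every call returns the instance created on first call."""
    instances = {}

    def get_instance(*args, **kwargs):
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]

    return get_instance


@singleton
class DictLoader:
    """Hypothetical loader; takes lines directly instead of reading a file."""

    def __init__(self):
        self.load_count = 0
        self._stopwords = None

    def load_stopwords(self, lines):
        if self._stopwords is None:
            self.load_count += 1
            self._stopwords = [line.strip() for line in lines]
        return self._stopwords


@functools.lru_cache(maxsize=None)
def load_puncts():
    """Function-level singleton: the body runs once, later calls hit the cache."""
    return frozenset(".,!?")


first = DictLoader().load_stopwords(["the\n", "a\n"])
second = DictLoader().load_stopwords(["ignored\n"])
```

Note that after decoration `DictLoader` is a function rather than a class, so `isinstance` checks and subclassing against it will not work.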
The dict should support a one-to-one mapping between key and index (key <==> index)
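A one-to-one key/index mapping can be kept with a dict for key to index and a list for index to key. The `Vocab` class below is a hypothetical sketch of that idea, not an existing class in this project.

```python
class Vocab:
    """Bidirectional mapping: each key gets a stable, unique integer index."""

    def __init__(self):
        self._key_to_index = {}
        self._index_to_key = []

    def add(self, key):
        """Register `key` if it is new and return its index."""
        if key not in self._key_to_index:
            self._key_to_index[key] = len(self._index_to_key)
            self._index_to_key.append(key)
        return self._key_to_index[key]

    def index(self, key):
        return self._key_to_index[key]

    def key(self, index):
        return self._index_to_key[index]

    def __len__(self):
        return len(self._index_to_key)


vocab = Vocab()
for token in ["the", "cat", "the"]:
    vocab.add(token)
```

Indices are assigned in insertion order and never reused, so both lookups stay consistent as long as keys are only added.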
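Requirements to satisfy: 1. Record the features, including the original sentence and some related information, in CSV format. 2. Use the composite pattern so that the names of all Features in use can be collected.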