Closed jeff1evesque closed 5 years ago
448d894: implements the MongoDB finalize step, which returns None
when the reducer is not executed because the corresponding link_id
occurs only once:
{'_id': 't3_3vjv', 'value': None}
{'_id': 't3_3vl9', 'value': None}
{'_id': 't3_3vlu', 'value': None}
{
'_id': 't3_3vlz',
'value': {
'score': [2.0],
'match_id': ['c3vph'],
'comments': ["why do you think it's a fake? what's so hard in doing a vnc loopback?"],
'posts': ['A fake, I suppose. Still worth a look. And a smile ;)']
}
},
{'_id': 't3_3vml', 'value': None}
{'_id': 't3_3vmq', 'value': None}
{'_id': 't3_3vn3', 'value': None}
{'_id': 't3_3vox', 'value': None}
{'_id': 't3_3vpb', 'value': None}
{
'_id': 't3_3vqd',
'value': {
'score': [2.0],
'match_id': ['c3wvv'],
'comments': ["(I'm in the business world where skills don't matter, unlike techie land)"],
'posts': ['Or: "I have the skills and I need the money, but you\'re going to hire the person that went to the same college as you anyway so I might as well get a head start on waiting those tables."']
}
},
{'_id': 't3_3vrj', 'value': None}
{'_id': 't3_3vso', 'value': None}
{'_id': 't3_3vt0', 'value': None}
{'_id': 't3_3vt1', 'value': None}
{'_id': 't3_3vto', 'value': None}
{'_id': 't3_3vu0', 'value': None}
{'_id': 't3_3vu9', 'value': None}
{'_id': 't3_3vuh', 'value': None}
{
'_id': 't3_3vut',
'value': {
'score': [1.0],
'match_id': ['c3vxm'],
'comments': ["That will get you really far...\r\n\r\nSaying that marketing like this cannot be studied is like saying that math cannot be because it is to complex and there will always be unsolved problems.\r\n\r\nI was referring to methods along the lines of what's discussed in: `The Anatomy of Generating Buzz` by Emanuel Rosen. Unfortunately I do not have this book and a reddit search on buzz turned up nothing but bees.."],
'posts': ["Don't. Just make it good enough and it will generate Buzz for itself."]
}
},
{'_id': 't3_3vwd', 'value': None}
{'_id': 't3_3vwm', 'value': None}
{'_id': 't3_3vx9', 'value': None}
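The None values above follow from how MongoDB map-reduce works: the reduce function is not called for a key that has only a single emitted value, so the finalize step is the only place to normalize those cases. A minimal pure-Python sketch of that behaviour follows. It is not the code from 448d894: the `map_reduce_finalize`, `reducer`, and `finalizer` names, the `reduced` marker, and the `body` field are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def map_reduce_finalize(records, key_field, reducer, finalizer):
    """Simulate map -> reduce -> finalize in plain Python (illustrative only)."""
    # Map phase: group the emitted values by key.
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key_field]].append(record)

    results = []
    for key, values in grouped.items():
        # As in MongoDB, the reducer is skipped when a key has a single value.
        value = reducer(key, values) if len(values) > 1 else values[0]
        results.append({'_id': key, 'value': finalizer(key, value)})
    return results


def reducer(key, values):
    # Hypothetical reducer: mark the value as reduced and collect the bodies.
    return {'reduced': True, 'bodies': [v['body'] for v in values]}


def finalizer(key, value):
    # Return None when the reducer never ran for this key.
    if value.get('reduced'):
        return {'bodies': value['bodies']}
    return None


# Made-up records: 't3_a' occurs once, 't3_b' occurs twice.
records = [
    {'link_id': 't3_a', 'body': 'first'},
    {'link_id': 't3_b', 'body': 'second'},
    {'link_id': 't3_b', 'body': 'third'},
]
results = map_reduce_finalize(records, 'link_id', reducer, finalizer)
```

Here `results` holds a None value for `t3_a` (single instance) and a populated value for `t3_b`, mirroring the shape of the output above.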
The following preserves the index order between lists:
>>> x={1:['a','b','c'], 2:['a', 'i']}
>>> y={1:['d','e','f'],2:['g']}
>>>
>>>
>>> for k, v in x.items():
...     if k in y:
...         y[k] += v
...     else:
...         y[k] = v
...
>>>
>>> print(y)
{1: ['d', 'e', 'f', 'a', 'b', 'c'], 2: ['g', 'a', 'i']}
Therefore, we'll aggregate the corresponding mapreduced values in the same way.
The result returned by the mapreduce needs to be collapsed into Python lists. This is a necessary step before applying tokenization-related techniques.
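A rough sketch of that collapse step, assuming the finalized results are available as a list of dicts shaped like the output above. The `collapse_results` name is hypothetical, and the sample data is abbreviated from the output shown earlier:

```python
from collections import defaultdict


def collapse_results(results):
    """Collapse finalized mapreduce documents into one list per field.

    Documents whose 'value' is None (reducer never ran) are skipped.
    Because every field is extended in the same pass, index i in one
    list still corresponds to index i in the others.
    """
    collapsed = defaultdict(list)
    for doc in results:
        value = doc.get('value')
        if value is None:
            continue
        for field, items in value.items():
            collapsed[field].extend(items)
    return dict(collapsed)


# Abbreviated sample in the same shape as the finalize output.
sample = [
    {'_id': 't3_3vjv', 'value': None},
    {'_id': 't3_3vlz', 'value': {'score': [2.0], 'match_id': ['c3vph']}},
    {'_id': 't3_3vml', 'value': None},
    {'_id': 't3_3vut', 'value': {'score': [1.0], 'match_id': ['c3vxm']}},
]
collapsed = collapse_results(sample)
```

With this sample, `collapsed['score'][i]` and `collapsed['match_id'][i]` refer to the same original document, which is the index-order property demonstrated in the dictionary example.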