Closed by hcvazquez 4 years ago
Hi @hcvazquez, great question. This is due to index sorting, which isn't reflected well in the docs right now (but it is on our list to update). This was discussed on the Spectrum thread here: https://spectrum.chat/snorkel/help/how-to-use-the-pandasparallelapplier~cf50f563-28e6-418c-93a3-337384566c13
Closing for now, feel free to re-open!
Issue description
I'm using PandasParallelLFApplier to apply labeling functions to a pandas dataframe with 5000 rows.
Code example/repro steps
Using PandasParallelLFApplier
Same code using PandasLFApplier
The second row of the output is different between the two.
Expected behavior
I would expect the same result for both. Labeling coverage and overlaps are the same for both, so the difference has to be in the order of the rows.
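
For anyone hitting the same thing, here is a minimal sketch of how a row-order difference can leave coverage identical while individual rows differ. It uses only pandas and NumPy and does not call Snorkel. The labeling function `lf_contains_spam`, the helper `apply_in_order`, and the toy data are all hypothetical, and the assumption that the parallel path returns rows in sorted-index order is based on the "index sorting" explanation in this thread, not on a reading of the library code.

```python
import numpy as np
import pandas as pd

# Toy dataframe with a non-monotonic index, e.g. after shuffling or filtering.
df = pd.DataFrame(
    {"text": ["spam offer", "hello", "free spam", "meeting at noon"]},
    index=[2, 0, 3, 1],
)


def lf_contains_spam(row):
    # Hypothetical labeling function: 1 if "spam" appears, -1 (abstain) otherwise.
    return 1 if "spam" in row.text else -1


def apply_in_order(frame):
    # Stand-in for an applier: one label row per dataframe row, in visiting order.
    return np.array([[lf_contains_spam(row)] for row in frame.itertuples()])


# Rows follow the dataframe as given (what a serial pass would produce).
L_original_order = apply_in_order(df)

# Rows follow the sorted index (ASSUMED behavior of the parallel path).
L_sorted_order = apply_in_order(df.sort_index())

# Coverage is a per-column statistic, so it is unaffected by row order.
coverage_original = (L_original_order != -1).mean(axis=0)
coverage_sorted = (L_sorted_order != -1).mean(axis=0)

# Map the sorted-order rows back to the original row order.
order = np.argsort(df.index.to_numpy(), kind="stable")
L_realigned = np.empty_like(L_sorted_order)
L_realigned[order] = L_sorted_order

print(np.array_equal(L_original_order, L_sorted_order))
print(np.array_equal(L_original_order, L_realigned))
print(coverage_original, coverage_sorted)
```

If this assumption holds for your setup, a simpler workaround may be to call `df.reset_index(drop=True)` (or `df.sort_index()`) before applying either applier, so that the given order and the sorted-index order coincide.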
System info