caravelahc / paratex

Parliamentary attendance extractor.

Remove Pandas dependency #26

Closed JPTIZ closed 3 years ago

JPTIZ commented 3 years ago

Greatly reduces the weight of our dependencies by dropping pandas. This is fine for now because pandas is not used for any complex operation: it was only used to sort and join rows and save them to a CSV file. As the diff shows, the code itself changes very little (apart from defining a type for CSV files), even in number of lines.
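For illustration, here is a minimal sketch of how sorting, joining, and saving rows can be done with only the standard library. It is not the code in this PR: the `Row` alias, the `merge_and_sort` and `write_csv` helpers, and the `date`/`name`/`present` columns are all hypothetical names chosen for the example.

```python
import csv
import io
from operator import itemgetter
from typing import Dict, Iterable, List, TextIO

# Hypothetical row type: one CSV record as a column-name -> value mapping.
Row = Dict[str, str]


def merge_and_sort(chunks: Iterable[List[Row]], key: str) -> List[Row]:
    """Join several lists of rows into one list, sorted by the given column."""
    rows = [row for chunk in chunks for row in chunk]
    return sorted(rows, key=itemgetter(key))


def write_csv(rows: List[Row], out: TextIO, fieldnames: List[str]) -> None:
    """Write a header line followed by the rows to an open text stream."""
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Usage with made-up data: two "sessions" joined and sorted by ISO date.
session_a = [{"date": "2011-03-02", "name": "B", "present": "yes"}]
session_b = [{"date": "2011-02-01", "name": "A", "present": "no"}]

merged = merge_and_sort([session_a, session_b], key="date")
buffer = io.StringIO()
write_csv(merged, buffer, fieldnames=["date", "name", "present"])
print(buffer.getvalue())
```

Sorting on ISO-formatted date strings works here only because they order lexicographically; other column types would need a conversion in the sort key.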

Note: merge after #24.

JPTIZ commented 3 years ago

> Was there much difference in the time needed to run the operations? Probably the same, right?

Yes, no visible difference. I didn't profile it, though, but downloading all sessions since 2011 was really fast even counting the web accesses, so it is definitely not our bottleneck.

> I don't much like the decision, because it feels like pandas will eventually come back if we need more complex operations, but it will be easy to reinsert if needed, so it's probably fine.

I understand, but since we really don't need it now (and there's a chance we still won't need it later), keeping it is a heavy downside, since installing the dependency may take ages.