Closed icaoberg closed 3 months ago
@gesinaphillips pointed out my code is bad :-1:
It can be optimized by making a single call to `hubmapbags.apis.get_dataset_info`, but I will need to find an alternative to `parallel_apply` (most likely something like `multiprocessing`).
@fshormin imagine
`for index, row in df.iterrows(): ...`
Can I process these rows in parallel? Investigate `multiprocessing` or any other out-of-the-box solutions.
For columns I can use pandarallel.
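For the row-wise question, here is a rough sketch of one out-of-the-box option: `multiprocessing.pool.ThreadPool` from the standard library. It assumes the per-row work is mostly waiting on an API, so threads are enough. `fetch_info` and the sample rows are hypothetical stand-ins, not the actual loop body or the `hubmapbags` API.

```python
from multiprocessing.pool import ThreadPool


def fetch_info(row):
    """Hypothetical per-row worker; stands in for the body of the iterrows loop."""
    # Placeholder logic only: the real code would do its per-row lookup here.
    return {"id": row["id"], "length": len(row["name"])}


def process_rows(rows, worker=fetch_info, workers=4):
    """Apply `worker` to every row concurrently and return results in input order."""
    with ThreadPool(processes=workers) as pool:
        # pool.map blocks until all rows are done and preserves the input order.
        return pool.map(worker, rows)


# Made-up sample rows; with a DataFrame, rows = df.to_dict("records") gives the same shape.
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "bb"}]
results = process_rows(rows)
```

If the per-row work turns out to be CPU-bound, `multiprocessing.Pool` exposes the same `map` interface, but then the worker must be a picklable top-level function and the call should sit under an `if __name__ == "__main__":` guard.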