Closed by santiarcar 2 years ago
Hi @santiarcar,
The progress bar is not implemented for axis=1 applies on dataframes containing strings. This is because these applies are implemented via Modin, and there is not yet an easy way to leverage the progress bar for Modin dataframes.
If you want to force swifter to use Dask in these instances, you can do 'df.swifter.allow_dask_on_strings().apply(...)'
Using Dask will enable the progress bar, but will also be somewhat less performant than the default Modin apply.
I hope that helps! Jason
May be of interest: https://github.com/modin-project/modin/pull/1589
This seems to make sense, but oddly, I see the progress bar for df's with < 100k rows and not for df's with > 100k rows. Could this indicate that the engine isn't using Dask for smaller df's and is instead just applying a vanilla tqdm progress_apply()?
Hi @edridgedsouza, yes, that is the case. For small df's it does fall back to pandas. The following link shows where it samples the dataframe apply to determine whether to use Modin or pandas: https://github.com/jmcarpenter2/swifter/blob/master/swifter/swifter.py#L357
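For anyone curious, the sampling idea can be sketched roughly like this. This is only an illustration and not swifter's actual code (the real logic is at the link above): the function name choose_engine, the sample_size and threshold_s parameters, and the "pandas"/"parallel" labels are all made up for the example. It times the function on a small sample, extrapolates linearly to the full dataset, and only picks the parallel path when the estimate exceeds a threshold.

```python
import time


def choose_engine(rows, func, sample_size=1000, threshold_s=1.0):
    """Pick an apply strategy by timing func on a small sample.

    Hypothetical sketch: names and defaults are assumptions for
    illustration, not part of swifter's API.
    """
    if not rows:
        # Nothing to time; the plain path is trivially fine.
        return "pandas"

    sample = rows[:sample_size]

    # Time the function on the sample only.
    start = time.perf_counter()
    for row in sample:
        func(row)
    elapsed = time.perf_counter() - start

    # Linear extrapolation from the sample to the full dataset.
    estimated = elapsed * len(rows) / len(sample)

    # Small/fast jobs stay on the plain path (where a tqdm-style
    # progress bar is easy); slow jobs go to the parallel path.
    return "parallel" if estimated > threshold_s else "pandas"
```

Under these assumptions, a cheap function on a small dataframe stays below the threshold and takes the plain path, which would match seeing the progress bar only on the smaller df's.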
Modin has been removed from the axis=1 applies, so you should see progress bars for axis=1 string applies again :)
Hi!
I've been using swifter for a while as I'm working on an ETL process where I need to handle huge dataframes.
I was used to seeing the progress bar when I used swifter.apply(), but it hasn't appeared for a while. I'm sharing code through a repository, but that shouldn't be a problem, should it?
Maybe the progress bar has been deprecated in later swifter versions?
I'm using swifter just like this with the latest version (1.0.7):
df = df.swifter.apply(lambda row: custom_function(row), axis=1)