ankane / ruby-polars

Blazingly fast DataFrames for Ruby
Other
858 stars 34 forks source link

Enhanced `drop` Method to Gracefully Handle Non-Existent Columns #61

Closed mezbahalam closed 7 months ago

mezbahalam commented 7 months ago

This PR introduces enhancements to the drop method in the DataFrame class. The method now checks if a column exists before attempting to drop it. If the column does not exist, a warning message is printed instead of raising an error. This makes the method more robust and user-friendly, as it can handle attempts to drop non-existent columns gracefully.

Changes include:

This PR does not introduce any breaking changes, as the updated drop method maintains backward compatibility.

ankane commented 7 months ago

Hi @mezbahalam, thanks for the PR. It looks like silently ignoring non-existent columns is the latest behavior in Python, so happy to update to that.

import polars as pl

df = pl.DataFrame({'a': [1, 2, 3]})
df.drop('b')

However, please make sure the code matches py-polars for maintainability (also, this is a breaking change, but the next version will likely be 0.10.0).

mezbahalam commented 7 months ago

hi, @ankane , thanks for getting back to me so quickly. I'm on board with making our code match py-polars. But, I'm not sure if they've gone for a silent approach when dealing with this. I found a issue; here's the link to check it out.

ankane commented 7 months ago

If you run the Python code above, it doesn't have any output (similar to Hash#delete on a non-existent key in Ruby).

mezbahalam commented 7 months ago

hi @ankane If we attempt to drop multiple columns and one of those columns does not exist, how should the method behave? Would you prefer the drop method to return nil in this scenario?

ankane commented 7 months ago

It should behave the same as the Python library (relevant code).

mezbahalam commented 7 months ago

@ankane in my last commit, I made our code work just like the py-polars drop method. When you have a moment, could you check it out? If you notice anything off, please let me know!

ankane commented 7 months ago

The code still looks very different than py-polars (see the link in my last comment).

mezbahalam commented 7 months ago

@ankane what are your thoughts on this code?

    def drop(*columns)
      df = clone
      columns.flatten.each do |column_name|
        df._df.drop_in_place(column_name) if df._df.columns.include?(column_name)
      end
      df
    end
ankane commented 7 months ago

Sorry @mezbahalam, I won't be able to help more with this, but will plan to update the behavior for 0.10.0.