basilisp-lang / basilisp

A Clojure-compatible(-ish) Lisp dialect targeting Python 3.8+
https://basilisp.readthedocs.io
Eclipse Public License 1.0

How to read a property at `[index, string]`? #1092

Open johnjelinek opened 4 days ago

johnjelinek commented 4 days ago

Given the following example:

# importing pandas module 
import pandas as pd 

# reading csv file from url 
data = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv") 

# creating position and label variables
position = 2
label = 'Name'

# calling .at[] method
output = data.at[position, label]

# display
print(output)

How do I call the .at[position, label] part?

I tried:

(def data (pd/read_csv "https://media.geeksforgeeks.org/wp-content/uploads/nba.csv"))
(-> data .iloc (subvec 0 3))
;;=>             Name            Team  Number Position   Age Height  Weight            College     Salary
;;   0  Avery Bradley  Boston Celtics     0.0       PG  25.0    6-2   180.0              Texas  7730337.0
;;   1    Jae Crowder  Boston Celtics    99.0       SF  25.0    6-6   235.0          Marquette  6796117.0
;;   2   John Holland  Boston Celtics    30.0       SG  27.0    6-5   205.0  Boston University        NaN

(-> data (.at (list 2 "Name")))
  ;;=> Traceback (most recent call last):
  ;;   TypeError: '_AtIndexer' object is not callable 

(.- (.at data) (list 2 "Name"))
  ;;=> 
  ;;     exception: <class 'basilisp.lang.compiler.exception.CompilerException'>
  ;;         phase: :analyzing
  ;;       message: host interop field must be a symbol
  ;;          form: (.- (.at data) (list 2 "Name"))

johnjelinek commented 4 days ago

Seems like maybe a workaround could be:

(->> data .loc (drop 2) (take 1) (map #(.-Name %)) first)
  ;;=> "John Holland"

ikappaki commented 4 days ago

Hi @johnjelinek,

This would be (aget (.-at data) #py (2 "Name")):

> basilisp repl
basilisp.user=> (import [pandas :as pd])
nil

basilisp.user=> (def data (pd/read_csv "https://media.geeksforgeeks.org/wp-content/uploads/nba.csv"))
                (-> data .iloc (subvec 0 3))
            Name            Team  Number Position   Age Height  Weight            College     Salary
0  Avery Bradley  Boston Celtics     0.0       PG  25.0    6-2   180.0              Texas  7730337.0
1    Jae Crowder  Boston Celtics    99.0       SF  25.0    6-6   235.0          Marquette  6796117.0
2   John Holland  Boston Celtics    30.0       SG  27.0    6-5   205.0  Boston University        NaN

basilisp.user=> (aget (.-at data) #py (2 "Name"))
"John Holland"

Please also have a look at https://github.com/ikappaki/basilisp-kernel/blob/main/notebooks/pandas-03-select.ipynb where I replicated some examples from the pandas getting started guide for the Basilisp Kernel.

Pandas makes heavy use of overloaded indexing operators, including slices, that don't map one-to-one onto Basilisp's aget/aset primitives. It is on my list to discuss this with @chrisrink10 at some point, to see whether he can think of a way to support this type of indexing at the language level ...

I hope this helps.
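For readers wondering why a tuple is the right key here, the following plain-Python sketch shows what the bracket syntax does with a comma-separated index. KeyRecorder is a hypothetical stand-in class, not part of pandas or Basilisp; it only echoes back whatever key Python hands to `__getitem__`. The connection to aget is an assumption based on the REPL session above, where passing a tuple to aget returned the expected cell.

```python
class KeyRecorder:
    """Hypothetical stand-in for an indexer object such as the one behind `.at`."""

    def __getitem__(self, key):
        # Return the key unchanged so we can inspect what Python passed in.
        return key


rec = KeyRecorder()
key = rec[2, "Name"]

# A comma-separated index arrives as a single tuple argument.
print(type(key).__name__)                   # tuple
print(key == (2, "Name"))                   # True
# Writing the tuple explicitly is the same call.
print(rec[2, "Name"] == rec[(2, "Name")])   # True
```

So `data.at[position, label]` in Python is a single subscript with the key `(position, label)`, which is what `#py (2 "Name")` spells out on the Basilisp side.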

johnjelinek commented 4 days ago

Thanks, is there another way to write the following?

#py (2 "Name")

I tried:

(lisp->py (list 2 "Name"))
;;=> #py [2 "Name"]

but that produces a Python list (#py [2 "Name"]) rather than a tuple, and it didn't pass into aget correctly.

ikappaki commented 4 days ago

The #py () reader syntax produces a Python tuple.

Another way to write this is to use the built-in tuple fn:

basilisp.user=> (def t (tuple [2 "Name"]))
#'basilisp.user/t

basilisp.user=> t
#py (2 "Name")

basilisp.user=> (type t)
<class 'tuple'>

basilisp.user=> (aget (.-at data) (tuple [2 "Name"]))
"John Holland"
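
The list-versus-tuple distinction matters on the Python side as well. The sketch below uses a hypothetical dict-backed class, TableAt, to show one way an indexer can accept a tuple key and reject a list with the same contents. This is only an illustration: pandas performs its own key checks and this is not how `.at` is implemented.

```python
class TableAt:
    """Hypothetical dict-backed indexer; not how pandas implements `.at`."""

    def __init__(self, cells):
        self._cells = cells

    def __getitem__(self, key):
        # Dict lookup requires a hashable key: tuples qualify, lists do not.
        return self._cells[key]


at = TableAt({(2, "Name"): "John Holland"})

as_list = [2, "Name"]
as_tuple = tuple(as_list)   # same elements, but now a tuple

print(at[as_tuple])         # John Holland

list_rejected = False
try:
    at[as_list]
except TypeError:
    # Lists are unhashable, so the lookup fails before any match is attempted.
    list_rejected = True

print(list_rejected)        # True
```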

FalseProtagonist commented 17 hours ago

You can also do (import [operator :as py-op]) and then (py-op/getitem x (python/tuple [a b])). You will also often need (python/slice nil), which is the equivalent of a bare : in Python indexing.
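
As a rough Python-side illustration of the two calls mentioned in this comment, the sketch below checks that `operator.getitem` with an explicit tuple matches the bracket syntax, and that a bare `:` inside brackets arrives as `slice(None)`. KeyRecorder is again a hypothetical helper that returns the key it receives.

```python
import operator


class KeyRecorder:
    """Hypothetical helper that returns whatever key it is indexed with."""

    def __getitem__(self, key):
        return key


rec = KeyRecorder()

# Bracket syntax with a bare colon and a label.
via_brackets = rec[:, "Name"]

# The same key built by hand and passed through operator.getitem.
via_getitem = operator.getitem(rec, (slice(None), "Name"))

print(via_brackets)                  # (slice(None, None, None), 'Name')
print(via_brackets == via_getitem)   # True
```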