TidierOrg / Tidier.jl

Meta-package for data analysis in Julia, modeled after the R tidyverse.
MIT License

Implementing join functions #12

Closed zhezhaozz closed 1 year ago

zhezhaozz commented 1 year ago

I am using this draft PR for self-learning purposes. Do NOT merge this PR.

This PR implements join functions, starting with left_join(df1, df2, by). The goals are as follows:

zhezhaozz commented 1 year ago

left_join now accepts an expression for the by argument. For example:

left_join(df1, df4, by=:(id = zid))
left_join(df1, df5, by=:([id = zid, pid = fid]))
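
A minimal sketch of how a quoted by expression like the ones above could be turned into the column pairs that DataFrames.leftjoin takes for its on keyword. The helper name by_to_on is hypothetical and is not the code from this PR; it only relies on :(id = zid) being an Expr whose head is = and whose two arguments are symbols.

```julia
# Hypothetical helper (not the PR's code): convert a quoted `by`
# expression into Pair(s) of column names.
function by_to_on(by::Expr)
    if by.head == :(=) || by.head == :kw
        # :(id = zid) has args [:id, :zid]
        return by.args[1] => by.args[2]
    elseif by.head == :vect
        # :([id = zid, pid = fid]) holds one assignment Expr per element
        return [by_to_on(arg) for arg in by.args]
    else
        throw(ArgumentError("unsupported `by` expression: $by"))
    end
end

single = by_to_on(:(id = zid))                 # :id => :zid
multi  = by_to_on(:([id = zid, pid = fid]))    # a vector with one pair per key
```

Because the expression is walked as an Expr tree, nothing has to be converted to a string and split.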

zhezhaozz commented 1 year ago

@left_join can take the by argument in any of the following forms:

@left_join(df1, df3, "id")
@left_join(df1, df2, ["id", "pid"])
@left_join(df1, df4, id == zid)
@left_join(df1, df5, [id == zid, pid == fid])

parse_join_by is the function that parses the expression passed as the by argument, without any string manipulation.
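
For illustration, here is a rough sketch of what a parser of this kind might look like for the four call forms listed above. The names sketch_parse_by and @sketch_join_by are made up for this example and are not the actual parse_join_by from the PR. The macro receives its argument unevaluated, so the caller does not need to quote it.

```julia
# Hypothetical sketch, not the PR's parse_join_by.
# Each method maps one accepted form to Symbol(s) or Pair(s).
sketch_parse_by(by::AbstractString) = Symbol(by)   # "id"
sketch_parse_by(by::Symbol) = by                   # id

function sketch_parse_by(by::Expr)
    if by.head == :call && length(by.args) == 3 && by.args[1] == :(==)
        # id == zid parses as Expr(:call, :(==), :id, :zid)
        return by.args[2] => by.args[3]
    elseif by.head == :vect
        # ["id", "pid"] or [id == zid, pid == fid]
        return [sketch_parse_by(arg) for arg in by.args]
    end
    throw(ArgumentError("unsupported `by` expression: $by"))
end

# QuoteNode keeps the parsed value from being treated as code to evaluate.
macro sketch_join_by(by)
    return QuoteNode(sketch_parse_by(by))
end

a = @sketch_join_by("id")
b = @sketch_join_by(["id", "pid"])
c = @sketch_join_by(id == zid)
d = @sketch_join_by([id == zid, pid == fid])
```

The results are a Symbol, a vector of Symbols, a Pair, and a vector of Pairs, which are the shapes DataFrames.leftjoin accepts for on.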

Next steps:

  1. Allow by to be nothing;
  2. Add a parse_join parsing engine to handle more DataFrames.leftjoin arguments;
  3. Add test cases.
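
One possible reading of the first step is a natural join, as in dplyr: when by is nothing, join on the columns the two tables have in common. That reading is an assumption, and default_join_keys below is a hypothetical helper, not something from this PR. It works on plain vectors of column names, such as what names(df) returns in DataFrames.

```julia
# Hypothetical helper: choose join keys when `by` is `nothing`.
# Assumes "no by" means "join on all shared column names".
function default_join_keys(left_names, right_names, by = nothing)
    by === nothing || return by            # an explicit `by` is used as-is
    common = intersect(left_names, right_names)
    isempty(common) && throw(ArgumentError("no common columns to join on"))
    return Symbol.(common)
end

keys_auto     = default_join_keys(["id", "pid", "x"], ["pid", "id", "y"])
keys_explicit = default_join_keys(["id", "x"], ["zid", "y"], :id => :zid)
```

intersect keeps the order of its first argument, so the keys follow the column order of the left table.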

kdpsingh commented 1 year ago

Implemented a different way of doing this in PR #30.