arrow and dplyr agree on the answer, but duckdb is missing a single supplier, Supplier#000048933. This supplier looks like it meets the criteria, yet running the query in DuckDB does not return it.
# This looks eligible for partkey 1998894, supplier key 48933
sub_sql <- "
  SELECT
    *
  FROM
    lineitem
  INNER JOIN
    part
    ON l_partkey = p_partkey
  WHERE
    p_name LIKE 'forest%'
    AND l_suppkey = 48933
    AND l_shipdate >= CAST('1994-01-01' AS date)
    AND l_shipdate < CAST('1995-01-01' AS date)
"
result_duckdb <- as_tibble(dbGetQuery(con, sub_sql))
> result_duckdb
# A tibble: 2 × 25
  l_orderkey l_partkey l_suppkey l_linenumber l_quantity l_extendedprice
       <int>     <int>     <int>        <int>      <int>           <dbl>
1   48224898   1998894     48933            2         35           69748
2   14710885   1998894     48933            4         45           89676
# … with 19 more variables: l_discount <dbl>, l_tax <dbl>, l_returnflag <chr>,
# l_linestatus <chr>, l_shipdate <date>, l_commitdate <date>,
# l_receiptdate <date>, l_shipinstruct <chr>, l_shipmode <chr>,
# l_comment <chr>, p_partkey <int>, p_name <chr>, p_mfgr <chr>,
# p_brand <chr>, p_type <chr>, p_size <int>, p_container <chr>,
# p_retailprice <dbl>, p_comment <chr>