apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0
14.24k stars 3.47k forks source link

[R] CRAN packaging checklist for version 17.0.0 #43317

Open jonkeane opened 1 month ago

jonkeane commented 1 month ago

Describe the enhancement requested

Packaging checklist for CRAN release

For a high-level overview of the release process see the Apache Arrow Release Management Guide.

cc @jonkeane @amoeba @nealrichardson @assignUser @paleolimbot for coordination on the various tasks to release

We will want to cherry pick:

Before the release candidate is cut

Wait for the release candidate to be cut:

After release candidate has been cut

Prepare and check the .tar.gz that will be released to CRAN.

Release vote

Generate R package to submit to CRAN

Ensure linux binary packages are available:

Check binary Arrow C++ distributions specific to the R package

CRAN submission

Wait for CRAN...

Component(s)

R

Component(s)

R

jonkeane commented 1 month ago

@assignUser Did you use Rscript tools/update-checksums.R <libarrow version> in the past to generate the contents of https://github.com/apache/arrow/commit/12ec842516f8dc9503dc7eb7bf79286ca2468ef3 ?

assignUser commented 1 month ago

Yes :) Let me know if I can help with anything!

amoeba commented 1 month ago

I added the NEWS.md entry from https://github.com/apache/arrow/pull/43189 to the release blog post PR.

assignUser commented 1 month ago

@jonkeane mac builder: https://mac.r-project.org/macbuilder/results/1721447405-da61deb72185bfbc/ win builder should reach you by mail, if that's clean I think only a revdep check is missing. @paleolimbot didn't you have a super quick script that doesn't take a million hours to run?

jonkeane commented 1 month ago

Thanks for sending those, Jacob.

Here's the winbuilder output: https://win-builder.r-project.org/SrOKYg8yT1lm/

https://win-builder.r-project.org/SrOKYg8yT1lm/00check.log

The only thing in there that looks worrying is

* checking R code for possible problems ... [33s] NOTE
mutate.ArrowTabular: no visible global function definition for
  'left_join'
mutate.Dataset: no visible global function definition for 'left_join'
mutate.RecordBatchReader: no visible global function definition for
  'left_join'
mutate.arrow_dplyr_query: no visible global function definition for
  'left_join'
Undefined global functions or variables:
  left_join
* checking Rd files ... OK

Which I suspect is from https://github.com/apache/arrow/pull/41576/files and can be dealt with by adding dplyr:: onto those calls again

amoeba commented 1 month ago

I brought in the changes from maint-17.0.0-r (see my main branch) and kicked off two recheck jobs (1, 2).

Edit: I need to tweak recheck a bit to run so the linked jobs failed right away. I'll update them once recheck is running. Edit 2: Updated URLs.

nealrichardson commented 1 month ago

Which I suspect is from https://github.com/apache/arrow/pull/41576/files and can be dealt with by adding dplyr:: onto those calls again

Sorry! I thought I fixed those but I guess I added more. LMK if you need me to make a PR.

jonkeane commented 1 month ago

No problem, I've got one at https://github.com/apache/arrow/pull/43348

I'll merge that if CI shows it gets rid of the note (and there are no other bad consequences) then cherrypick that back to the maintenance branch.

This isn't a here-and-now problem, but boy do I wish we could say "these notes we don't care about but all the others we do" for things like this 🤦

jonkeane commented 1 month ago

@amoeba am I reading these logs from recheck correctly that with new arrow parquetize is failing in this example:

> ### Name: csv_to_parquet
> ### Title: Convert a csv or a txt file to parquet format
> ### Aliases: csv_to_parquet
> 
> ### ** Examples
> 
> 
> # Conversion from a local csv file to a single parquet file :
> 
> csv_to_parquet(
+   path_to_file = parquetize_example("region_2022.csv"),
+   path_to_parquet = tempfile(fileext=".parquet")
+ )
Reading data...
Writing data...
Error in Table__from_dots(dots, schema, option_use_threads()) : 
  No Set_elt found for ALTSTRING class [class: vroom_chr, pkg: vroom]
Calls: csv_to_parquet ... as_arrow_table.data.frame -> <Anonymous> -> Table__from_dots
Reading data...
Execution halted
amoeba commented 1 month ago

Hey @jonkeane, I think that's the right interpretation.

jonkeane commented 1 month ago

Ah, that is a regression on us: #43349 I need to step away for a bit, in case someone wants to investigate what's up there. I don't think much has changed with our altrep code, but this could be from our non-api avoidance changes (or possibly the change we made to avoid executing functions while reading data).

jonkeane commented 1 month ago

Ok, I've merged the fix for the altrep thing that was hitting parquetize. @amoeba would you mind triggering the revdep checks again?

jonkeane commented 1 month ago

I just got a note about errors from lz4 on CRAN's devel debian runners.

With current Debian testing, installation now fails, see the install log. Apparently this is from yesterday's system upgrade of

liblz4-dev:amd64 (1.9.4-2, 1.9.4-3), liblz4-1:amd64 (1.9.4-2, 1.9.4-3)

I can confirm that ARROW_WITH_LZ4=OFF ./configure works where ./configure now fails.

We will need to fix this as will in our 17 submission (and get it in within two weeks too).

We are linking against the shared, system lz4 which is fine in this case because CRAN does not distribute these binaries. But one possibility is to force non-shared lz4 and it'll grab source + build.

cc @assignUser @nealrichardson

nealrichardson commented 1 month ago

We can work around that, sure, but if that's failing, then it sounds like there's something off in our cmake routine in this case, and that should get addressed. For example, I see the message from this line in the logs: https://github.com/apache/arrow/blob/main/cpp/cmake_modules/ThirdpartyToolchain.cmake#L352 But I don't see either of the messages that should follow, about whether lz4 was found or not.

jonkeane commented 1 month ago

For example, I see the message from this line in the logs: https://github.com/apache/arrow/blob/main/cpp/cmake_modules/ThirdpartyToolchain.cmake#L352 But I don't see either of the messages that should follow, about whether lz4 was found or not.

When you say you don't see either of the messages that follow are you talking about https://github.com/apache/arrow/blob/48782e75708157f2296ab95699239207c5588871/cpp/cmake_modules/ThirdpartyToolchain.cmake#L354 or https://github.com/apache/arrow/blob/48782e75708157f2296ab95699239207c5588871/cpp/cmake_modules/ThirdpartyToolchain.cmake#L358 ? If so, that first one (line 354) is actually where the log line that you're seeing is emitted, I believe. The line you linked 352 is adding to the string, but the part that starts with Using ... is emitted at line 354 (so we do get there!)

jonkeane commented 1 month ago

@kou might have some ideas + help here too (sorry I should have tagged you on that original comment)

nealrichardson commented 1 month ago

Ah my bad, you are right.

kou commented 1 month ago

https://github.com/apache/arrow/pull/43468 will solve the problem. I'll complete it soon.

assignUser commented 1 month ago

Ah thanks @kou! I was just exploring the lz4 that comes with debian:testing you are right: foreach(_cmake_expected_target IN ITEMS LZ4::lz4_shared LZ4::lz4_static)

jonkeane commented 1 month ago

Thank you (again) @kou !

amoeba commented 1 month ago

Revdepchecks are running at https://github.com/amoeba/arrow/actions. It looks like one already failed but I don't have enough signal to check right now (will tomorrow).

jonkeane commented 1 month ago

FYI, I got another email from CRAN preponing the deadline to ~today since CRAN shuts down for two weeks tomorrow 😬

jonkeane commented 1 month ago

~Weird, I see ^^^ in the logs https://github.com/amoeba/arrow/actions/runs/10172857322/job/28137640795 but when I download the archives I don't see any issues with either of those in the actual logs.~

Ugh, nevermind, I was looking at the wrong thing! HMMM I see the same issue for parquetize now that I'm looking at the right thing.

Is it possible you haven't pulled the latest updates from https://github.com/apache/arrow/tree/maint-17.0.0-r ?

As for the tidyquery failure, maybe @ianmcook could help figure out what's up with that??

Package: tidyquery 0.2.4
Check: tests, Result: ERROR
    Running ‘testthat.R’
  Running the tests in ‘tests/testthat.R’ failed.
  Last 13 lines of output:
      6. ├─tidyquery::query("SELECT origin, dest,\n          COUNT(flight) AS num_flts,\n          round(SUM(seats)) AS num_seats,\n          round(AVG(arr_delay)) AS avg_delay\n      FROM flights_at f LEFT OUTER JOIN planes_at p\n        ON f.tailnum = p.tailnum\n      WHERE distance BETWEEN 200 AND 300\n      AND air_time IS NOT NULL\n      GROUP BY origin, dest\n      -- HAVING num_flts > 3000\n      ORDER BY num_seats DESC, avg_delay ASC\n      LIMIT 100;")
      7. │ └─tidyquery:::query_(data, sql, TRUE)
      8. │   └─out %>% verb(filter, !!(tree$where[[1]]))
      9. ├─tidyquery:::verb(., filter, !!(tree$where[[1]]))
     10. │ └─input$data %>% fun(...)
     11. ├─dplyr (local) fun(., ...)
     12. └─arrow:::filter.arrow_dplyr_query(., ...)
     13.   ├─arrow:::try_arrow_dplyr(...)
     14.   │ └─base::evalq(call <- match.call(), parent)
     15.   │   └─base::evalq(call <- match.call(), parent)
     16.   └─base::match.call()

    [ FAIL 9 | WARN 0 | SKIP 4 | PASS 226 ]
    Error: Test failures
    Execution halted
jonkeane commented 1 month ago

I've also tried to replicate the CRAN devel image locally (I'll push a PR up with that for CI after this fire drill). After Kou's change, we get past LZ4. But then it seems to fail with the following. This isn't related directly to the LZ4 change I don't think, and as far as I could tell on github that abseil code hasn't changed in years. This could be a false positive, since I don't have an exact replica of the CRAN runner, but wonder if anyone has any ideas about what this might be.

[ 55%] Building CXX object absl/strings/CMakeFiles/str_format_internal.dir/internal/str_format/extension.cc.o

-- stderr output is:
...skipping to end...
230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:167:3: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  167 |   X_VAL(g) X_SEP X_VAL(G) X_SEP X_VAL(a) X_SEP X_VAL(A) X_SEP \
      |   ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:167:18: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  167 |   X_VAL(g) X_SEP X_VAL(G) X_SEP X_VAL(a) X_SEP X_VAL(A) X_SEP \
      |                  ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:167:33: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  167 |   X_VAL(g) X_SEP X_VAL(G) X_SEP X_VAL(a) X_SEP X_VAL(A) X_SEP \
      |                                 ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:167:48: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  167 |   X_VAL(g) X_SEP X_VAL(G) X_SEP X_VAL(a) X_SEP X_VAL(A) X_SEP \
      |                                                ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:169:3: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  169 |   X_VAL(n) X_SEP X_VAL(p)
      |   ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:169:18: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  169 |   X_VAL(n) X_SEP X_VAL(p)
      |                  ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:49:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   49 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kStar;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:51:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   51 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kIntegral;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:53:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   53 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kFloating;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:55:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   55 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kNumeric;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:57:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   57 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kPointer;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
make[5]: *** [absl/strings/CMakeFiles/str_format_internal.dir/build.make:104: absl/strings/CMakeFiles/str_format_internal.dir/internal/str_format/extension.cc.o] Error 1
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [CMakeFiles/Makefile2:2382: absl/strings/CMakeFiles/str_format_internal.dir/all] Error 2
make[3]: *** [Makefile:136: all] Error 2
amoeba commented 1 month ago

Whoops, sorry about that @jonkeane. New revdep check runs are now here: strong, most. Those are being run from the latest maint-17.0.0-r. They seem to be stuck at the moment so hopefully they'll start soon.

amoeba commented 1 month ago

https://github.com/apache/arrow/issues/36969 looks similar to your above output, maybe this is the same cause?

jonkeane commented 1 month ago

@assignUser Do you know if we will need to re-generate the Apache-hosted binaries to include https://github.com/apache/arrow/pull/43468 and https://github.com/apache/arrow/pull/43157 and use those in this CRAN release? Or are we ok without those in the binaries since those are both build tools only?

jonkeane commented 1 month ago

Looks like tidyquery is still an issue (I'm surprised we didn't catch this the first time we ran through those!)

I'm seeing a bunch of errors like this one. Which I suspect are coming from https://github.com/apache/arrow/blob/7e50097ba4239cf9368b77f438d877d4176141c9/r/R/dplyr-eval.R#L204

✖ | 3    1   1 | arrow-dataset                                                                    
──────────────────────────────────────────────────────────────────────────────────────────────────
Error (test-arrow-dataset.R:3:3): Simple SELECT example query #1 returns expected result on Arrow Dataset
Error in `match.call()`: ... used in a situation where it does not exist
Backtrace:
     â–†
  1. ├─testthat::expect_equal(...) at test-arrow-dataset.R:3:3
  2. │ └─testthat::quasi_label(enquo(object), label, arg = "object")
  3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
  4. ├─... %>% as.data.frame()
  5. ├─base::as.data.frame(.)
  6. ├─tidyquery::query("SELECT Species, COUNT(*) AS n FROM iris_ad GROUP BY Species")
  7. │ └─tidyquery:::query_(data, sql, TRUE) at tidyquery-9546a7d191009cc56c2c11c5a54c33d3330a42f3/R/query.R:77:3
  8. │   └─... %>% verb(ungroup) at tidyquery-9546a7d191009cc56c2c11c5a54c33d3330a42f3/R/query.R:275:7
  9. ├─tidyquery:::verb(., ungroup)
 10. │ └─base::paste(...) at tidyquery-9546a7d191009cc56c2c11c5a54c33d3330a42f3/R/query.R:450:3
 11. ├─tidyquery:::verb(...)
 12. │ └─input$data %>% fun(...) at tidyquery-9546a7d191009cc56c2c11c5a54c33d3330a42f3/R/query.R:450:3
 13. ├─dplyr (local) fun(., ...)
 14. └─arrow:::summarise.arrow_dplyr_query(., ...)
 15.   ├─arrow:::try_arrow_dplyr(...)
 16.   │ └─base::evalq(call <- match.call(), parent)
 17.   │   └─base::evalq(call <- match.call(), parent)
 18.   └─base::match.call()
assignUser commented 1 month ago

@jonkeane pretty sure they only affect the source build but I'll check

nealrichardson commented 1 month ago

Here's a workaround for the tidyquery failure: https://github.com/apache/arrow/pull/43498

assignUser commented 1 month ago

So the lz4 one changes our bundled find module so that could affect the precompiled binaries. But only in theory as we are using either pkg-config to find and set flags or have them hardcoded and don't use cmake.

So both only matter when building from source.

jonkeane commented 1 month ago

Thanks for all of the help, everyone. I just sent the submission, I'll keep this issue posted if there are other issues.

jonkeane commented 1 month ago

Womp womp. I received the not-passing-auto-checks email. The problem is that we are calling XXX on <4.2 and CRAN's checks for these apparently are static and not actually checking if we are truly calling them (on R versions that they run on).

I'm partially tempted to wait and see if they end up accepting it anyway after seeing what's going on. I could also reply and explain (though I have extremely low hopes of that landing, even though it is simply a fact that before 4.2 the approved thing does not exist!). @nealrichardson do I remember you having an idea on possibly obfuscating this call for our compatibility branch?

The only NOTE is:

* checking compiled code ... NOTE
File 'arrow/libs/x64/arrow.dll':
  Found non-API calls to R: 'Rf_findVarInFrame3', 'SETLENGTH',
    'SET_GROWABLE_BIT', 'SET_TRUELENGTH'

Compiled code should not call non-API entry points in R.

See 'Writing portable packages' in the 'Writing R Extensions' manual,
and section 'Moving into C API compliance' for issues with the use of
non-API entry points.

https://win-builder.r-project.org/incoming_pretest/arrow_17.0.0_20240801_011430/Windows/00check.log https://win-builder.r-project.org/incoming_pretest/arrow_17.0.0_20240801_011430/Debian/00check.log

nealrichardson commented 1 month ago

You could wait to see if Uwe waves it through, or you could reply and explain that these are ifdef'd for backwards compatibility only.

The in-case-of-emergency hack would be to sed out these references from the .cpp files in the configure script if R is > 4.4 (the checks are only on devel now, right?). But it seems like this is a more general problem CRAN will have to solve, if they're adding new "official" API functions to replace some of these things but they're only in R-devel, how can you be on CRAN if you depend on them?

What's surprising to me is that, at least by the internal function names check is calling, they're looking at the compiled object files, which I would expect not to have these symbols in them because of the #ifdef: https://github.com/wch/r-source/blob/032b8240be6c8cf5e8e85ecde6f3f23e46739716/src/library/tools/R/sotools.R#L721-L768 And that function is wrapping nm.

In fact, I do see these symbols showing up locally, on new enough R, for example:

npr@nealrichardson-3P29 r % nm src/arrow.so | grep Rf_findVarInFrame3
                 U _Rf_findVarInFrame3
nealrichardson commented 1 month ago

Of course CRAN is closed starting tomorrow (and it's already tomorrow CEST), so this is probably not something we can solve until they're back.

jonkeane commented 1 month ago

Hmmmm, yeah something strange is going on here. It's possible that the #IFDEF isn't the right thing for detecting functions (https://stackoverflow.com/questions/6234755/does-ifdef-or-other-preprocessor-directives-work-for-function-declarations-t) and instead we should do something like #if (R_VERSION >= R_Version(4, 2, 0)) buuuuut I actually still see _Rf_findvarInFrame3 even if I remove any calls (conditional or otherwise) from arrow_cpp11.h (which is also the only place that calls it in our code) 🫠 I do see it in cpp11 https://github.com/r-lib/cpp11/blob/1c9dbb65cc9279a404f035f2fca990b338c9f602/inst/include/cpp11/environment.hpp#L37-L38 so maybe that's where it's coming from?

jonkeane commented 1 month ago

Ah yes, I think it is, indeed, coming from cpp11's environment.hpp. I just tried removing all of our imports / references to that and only then is nm -v src/arrow.so | grep Rf_findVarInFrame3 clean.

We still need #43504 but will still have an issue on that for automatic checks until cpp11 fixes + releases their fix.

nealrichardson commented 1 month ago

We may need to vendor (and possibly patch) cpp11 for this release then.

nealrichardson commented 3 weeks ago

Need to cherry-pick https://github.com/apache/arrow/pull/43649 (ab432b1362208696e60824b45a5599a4e91e6301) into the CRAN release

jonkeane commented 2 weeks ago

Ok, I've cherry-picked the latest fix onto this branch (#43735). I'll resubmit later this evening — in case anyone has any objections + suggestions for other things.

jonkeane commented 2 weeks ago

I have resubmitted 17.0.0.1

jonkeane commented 2 weeks ago

Aaaand we have been accepted 🎉

Do we already have a draft social media post?

amoeba commented 2 weeks ago

I haven't seen a draft yet. @thisisnic do you want to take this on again?

thisisnic commented 2 weeks ago

I was taking this release "off" to focus on my other commitments, though would be happy to review it if someone else wanted to write something up?

amoeba commented 2 weeks ago

I can get it started later today, I'll drop a link in here for reviews when that's done.

amoeba commented 2 weeks ago

Draft social media content is here. I still need to generate some code images for the examples and tweak a bit of the language but please have a look to see if I missed any tweet-worthy highlights. I see binaries are almost all up on CRAN which is great.

thisisnic commented 2 weeks ago

Language here looks great, thanks @amoeba!

jonkeane commented 2 weeks ago

Thanks, this is really great! I'll also post it on linked in (though not as a series). I also took the liberty to add emoji in the places to make it engaging 😂

assignUser commented 2 weeks ago

We are missing a checkbox to move the r-univerese tag, I'll move the tag forward in a bit e: done

amoeba commented 1 week ago

Thanks @jonkeane and @thisisnic for taking a look at the social media doc. I accepted suggestions and added code screenshots to the Drive folder where appropriate, see https://drive.google.com/drive/folders/1DzGoZNHbiZXNca2oHWjpBtJrEti3W-zQ?usp=drive_link, so I think this is ready for me and @jonkeane to post when x86_64 binaries go live.