Bioconductor / Contributions

Contribute Packages to Bioconductor
134 stars 33 forks source link

basilisk #1318

Closed LTLA closed 4 years ago

LTLA commented 4 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

0167f7b Bugfix for example.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

a691191 Destroy past Anaconda installations during Windows... df3fb75 Removed unnecessary OS detection utility.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

LTLA commented 4 years ago

This is done @hpages, I'm not going to make any more changes unless you tell me to.

hpages commented 4 years ago

Have you solved the binary problem?

LTLA commented 4 years ago

If you're talking about our previous discussions about creating tarballs on Macs, it seems that is a non-issue on the BioC build machines. The celaya2 BIN builds go through fine without timing out or even giving a warning about too-long file paths. Which is a pleasant surprise.

Of course, Windows has its own set of problems. We can't even put the Anaconda installation in there because of the file path length issue. So we shift the entire installation into a user cache, and BUILD BIN doesn't even have to care about trying to portably package it up.

hpages commented 4 years ago

So IIUC on Mac the user could just install the binary and be good to go?

How would things look from an end-user perspective on Windows?

LTLA commented 4 years ago

So IIUC on Mac the user could just install the binary and be good to go?

Yes, in theory. Haven't actually tried it.

How would things look from an end-user perspective on Windows?

It should still be seamless, but the Anaconda installation happens on the user's side, so there's more opportunity for things to go wrong. As it currently stands, we can't provide the binary in this case.

hpages commented 4 years ago

As it currently stands, we can't provide the [Windows] binary in this case.

But the package takes 1/2 hour to install from source on Windows. Are we sure we are ok with that?

Also it would be good to test how things will go for basilisk's client packages. Can you submit a toy basilisk's client package as an additional package to this issue?

Thanks!

LTLA commented 4 years ago

But the package takes 1/2 hour to install from source on Windows. Are we sure we are ok with that?

I don't know. It certainly doesn't take half an hour on any other system! But I'm out of options, unless you're willing to change the temporary CHECK and BUILD locations on windows so that the Anaconda paths fit in under 260 characters.

Also it would be good to test how things will go for basilisk's client packages. Can you submit a toy basilisk's client package as an additional package to this issue?

The tests already contain an installation of son.of.basilisk, a test client that is included in the inst subdirectory. This seems to work out alright on all systems.

hpages commented 4 years ago

unless you're willing to change the temporary CHECK and BUILD locations on windows so that the Anaconda paths fit in under 260 characters

This is a significant change to the build system so not something I'm eager to try unless we have some solid evidences that it will actually work and produce a valid self-contained binary.

One important question is whether the Windows and Mac binaries are valid self-contained binaries. Our builds (SPB or BBS) don't check this. Someone would need to install/test these binaries on pristine Windows and Mac systems. Do you have access to a pristine Mac where you can do this sort of testing?

hpages commented 4 years ago

One thing I notice is that some tests fail if the package installation folder is read-only:

test_check("basilisk")
# ══ testthat results  ═════════════════════════════════════════════════
# [ OK: 55 | SKIPPED: 0 | WARNINGS: 4 | FAILED: 3 ]
# 1. Failure: internal test package installs correctly (@test-package.R#8) 
# 2. Failure: internal test package installs correctly (@test-package.R#11) 
# 3. Error: internal test package installs correctly (@test-package.R#14) 

(I tried this on merida1 only.)

Are the tests trying to install stuff inside basilisk's installation folder? Except during installation of a package, the package installation folder should not be touched.

Setting back write permission on basilisk's installation folder allows the tests to run successfully. However I notice that they installed (and didn't remove) son.of.basilisk in R's package library folder. The tests should assume that the package library is a read-only folder so should install son.of.basilisk in a temporary folder. Another benefit of doing this is that then the tests won't need to clean after themselves so you'll kill 2 birds with a single stone.

LTLA commented 4 years ago

Do you have access to a pristine Mac where you can do this sort of testing?

Not directly, but I could ask a colleague. Do you have a link to a binary?

Are the tests trying to install stuff inside basilisk's installation folder? Except during installation of a package, the package installation folder should not be touched.

Interesting. I was just using whatever devtools::install() was using, and assuming that it was doing something reasonable. I guess it wasn't; it seems I probably need to set build=TRUE.

hpages commented 4 years ago

You can build the Mac binary with:

build-universal.sh <path/to/source/tarball>

The build-universal.sh script is in the utils folder of the BBS repo: https://github.com/Bioconductor/BBS/blob/master/utils/build-universal.sh

Note that the script needs to be able to call chown-rootadmin which is a small executable that you need to generate by compiling a little C program called chown-rootadmin.c (found in the same folder as build-universal.sh). See comment at the top of chown-rootadmin.c for how to compile the program and set the owner, group, and permissions of the chown-rootadmin executable.

It would be great if you could also get access to a Windows machine to investigate the feasibility of a self-contained Windows binary. I am under the impression that a source-only solution on Windows would mean that all basilik's clients would also need to be made available as source packages only. Is that correct? This is not viable. Not only because of the very long time it takes to install all this stuff from source on Windows but most importantly because of the toolchain and possibly other external deps that are required for this and that the end users typically don't have.

I was just using whatever devtools::install() was using, and assuming that it was doing something reasonable.

I think it would still be valuable that you submit a toy basilisk's client package as an additional package to this issue as it is more likely to reflect what will really happen with client packages (e.g. the SPB and BBS don't use devtools::install() to install packages and Bioconductor users are not supposed to do this either).

Thanks!

LTLA commented 4 years ago

I am under the impression that a source-only solution on Windows would mean that all basilik's clients would also need to be made available as source packages only. Is that correct?

This should not be necessary. AFAIK, the Anaconda installer does not require any of the Windows toolchain to do its job. No compilation actually happens (again, AFAIK), the installer just unpacks binary content into executables/libraries that can be immediately run. This is similarly true of client packages, conda just grabs prebuilt binaries and plonks them into a new conda environment.

Not only because of the very long time it takes to install all this stuff from source on Windows

I have no solution here, but it seems like it's a choice between "takes a long time" and "impossible".

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

0e4f468 Minor clarifications to vignette. 1b54911 Shifted internal pkg tests to avoid actually insta... 815920e Merge branch 'master' of https://github.com/LTLA/b... 05f3395 Fixed DESCRIPTION of test package. 54c4855 With hindsight, removed the ill-advised persistent... 91308a7 Reoxygenated, bumped version and date.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

0424e65 Remove obsolete envvars, get better debugging mess...

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

52b9fbf Force Anaconda to properly reinstall on Windows.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

ca5b03c Yet another fix for Windows installation path.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

LTLA commented 4 years ago

Sigh. As usual, the Windows error reports are mostly useless. #1346 would be instructive.

hpages commented 4 years ago

Please submit basilisk's client package as an additional package to this issue. This should work the same way as when people submit a software package + a companion data package.

the Anaconda installer does not require any of the Windows toolchain to do its job

I understand. However basilisk's clients will sometimes need Python modules that are not available in basilisk. With the current solution where the Windows users need to install basilisk from source and Anaconda installation happens on the user's side, it seems to me that they would also need to install the client from source. Or are you saying that the client could be made available as a Windows binary that contains the Python modules and that installing this binary would somehow add the Python modules to the Anaconda installation?

My concern about the Windows toolchain is just a general concern regarding a solution where we provide source-only packages on Windows. In addition to requiring extra Python modules, a basilisk's client could also contain native code and thus require that the end user has the Windows toolchain on their machine, which is not an option.

I have no solution here, but it seems like it's a choice between "takes a long time" and "impossible".

I thought a 3rd solution was to change the temporary CHECK and BUILD locations on Windows so that the Anaconda paths fit in under 260 characters, and that this would allow the generation of a valid self-contained Windows binary? Did I miss something?

LTLA commented 4 years ago

AdditionalPackage: https://github.com/LTLA/son.of.basilisk

bioc-issue-bot commented 4 years ago

Dear @LTLA ,

You (or someone) has already posted that repository to our tracker.

See https://github.com/Bioconductor/Contributions/issues/1346

You cannot post the same repository more than once.

If you would like this repository to be linked to issue number: 1318, Please contact a Bioconductor Core Member.

LTLA commented 4 years ago

Argh. Well, okay, in the meantime:

Or are you saying that the client could be made available as a Windows binary that contains the Python modules and that installing this binary would somehow add the Python modules to the Anaconda installation?

Yes. During the installation of the client on the user machines, it should pull down the Windows binaries for the required Python packages from conda. Everything is pre-built by conda (or pip), there's no need for any compilation anywhere.

If the client has native code, that's fine, it's independent of the Python installation process. So we should be able to process the usual parts of the R package with standard methods.

I thought a 3rd solution was to change the temporary CHECK and BUILD locations on Windows so that the Anaconda paths fit in under 260 characters, and that this would allow the generation of a valid self-contained Windows binary? Did I miss something?

I thought you said you didn't want to do that, but if you're up for it, then sure.

hpages commented 4 years ago

What I said is that doing this now on tokay2 is a significant change to the build system and so is not something I'm eager to try unless we have some solid evidences that it will actually work and produce a valid self-contained binary. Which is why I suggested that you get access to a Windows machine to investigate the feasibility of a self-contained Windows binary. Once we know this works, we can consider changing the temporary CHECK and BUILD locations on our Windows builders.

During the installation of the client on the user machines, it should pull down the Windows binaries for the required Python packages from conda.

How could installation of the basilisk's binary client on the user machine pull anything down besides the binary client itself? Clearly I'm still missing some important part of the puzzle.

LTLA commented 4 years ago

How could installation of the basilisk's binary client on the user machine pull anything down besides the binary client itself?

During installation of the client, it will use basilisk's conda executable to connect to Anaconda's servers and pull down binary Python packages to populate the client's conda environment. Well, that's the theory, anyway.

Which is why I suggested that you get access to a Windows machine to investigate the feasibility of a self-contained Windows binary.

I don't have one, which is why I'm playing this guessing game with the SPB.

I guess it would be... possible to get requisition one. But if the current solution works (and hey, installation time is down to several minutes in the latest build), then I am not inclined to put in any more effort to make life easier for Windows users. It has been very tiring.

hpages commented 4 years ago

During installation of the client, it will use basilisk's conda executable to connect to Anaconda's servers and pull down binary Python packages to populate the client's conda environment. Well, that's the theory, anyway.

Really? How? No code gets executed when you install a binary. The zip file gets downloaded and unzipped and that's all. install.packages() doesn't even try to load the package like it does when installing from source so there is no safeguards against installing a package that can't even be loaded.

I am not inclined to put in any more effort to make life easier for Windows users.

Tss, tsst... let's be nice with them ;-)

That's fine with me if the current solution indeed "works". That is:

LTLA commented 4 years ago

No code gets executed when you install a binary.

Okay, I didn't know that. But it matters not. The Python installation instructions are all stuffed into the .onLoad(), so they get run every time the package is loaded. If the Anaconda installation already exists, then it's a no-op, but if it doesn't, the package will instantiate it. This is true of both the basilisk and of any of its clients, and it should all work out on first actual use; basilisk will be loaded first and trigger the installation of the core, then the client will be loaded and trigger installation of its required environments.

  • So maybe the current source-only situation on Windows saves you some effort now but it will almost certainly be more effort in the long run. Just saying...

I know. But the burden of finding a Windows machine is just too much. I would rather just set Windows to be an unsupported platform and be done with it.

LTLA commented 4 years ago

In any case, if you can clear the son.of.basilisk cache, I can at least solve the immediate problem.

hpages commented 4 years ago

In any case, if you can clear the son.of.basilisk cache, I can at least solve the immediate problem.

Hopefully @lshep can help with this. Lori?

The Python installation instructions are all stuffed into the .onLoad()

Where do Python and its modules get installed? Remember that you cannot touch a package installation folder after a package has been installed (as previously discussed).

But the burden of finding a Windows machine is just too much. I would rather just set Windows to be an unsupported platform and be done with it.

Not an option here, unfortunately. basilisk is an infrastructure package!

Bioconductor has always taken Windows support seriously. I understand and sympathize with your reluctance to get your hands dirty on Windows though. If it was just me, Bioconductor would support Linux only and I would spend 10x less time on the build system. Anyway, not a choice. basilisk needs to be fully supported on Windows.

lshep commented 4 years ago

By clear the cache you mean do the magic on our end so you can add it as an additional package to this issue?

LTLA commented 4 years ago

@lshep Yes please. See what the @bioc-issue-bot said above.

Where do Python and its modules get installed? Remember that you cannot touch a package installation folder after a package has been installed (as previously discussed).

For windows, in rappdirs::user_cache_dir(). For everywhere else, it's inside the installation directory, but that's fine because we don't need these funky hacks for other OS's.

I understand and sympathize with your reluctance to get your hands dirty on Windows though.

It's not even that. I literally do not have access to a Windows box. I can't get dirty even if I wanted to. Requisitioning a physical or virtual machine would take... well, who knows. Maybe this quarter.

lshep commented 4 years ago

I think it should be good if you wanted to try the AdditionalPackage line again.

LTLA commented 4 years ago

AdditionalPackage: https://github.com/LTLA/son.of.basilisk

bioc-issue-bot commented 4 years ago

Hi @LTLA,

Starting build on additional package https://github.com/LTLA/son.of.basilisk.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your additional package repository will NOT trigger a new build.

The DESCRIPTION file of this additional package is:

Package: son.of.basilisk
Version: 0.0.1
Title: Son of Basilisk
Authors@R:
person("Aaron", "Lun", email="infinite.monkeys.with.keyboards@gmail.com",
    role=c("aut", "cre"))
License: GPL-3
Description: Provides an example of how to set up a 
package that depends on the basilisk package.
Imports:
basilisk,
reticulate
Suggests:
testthat
RoxygenNote: 7.0.2
StagedInstall: no
bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

53155aa Bumped version number.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

71222c6 Disabled yet more tests on Windows.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

0ef7520 Bumped version number.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

hpages commented 4 years ago

For everywhere else, it's inside the installation directory, but that's fine because we don't need these funky hacks for other OS's.

The onLoad hook is the wrong place to run code that touches the package installation folder. No matter the OS. Saying it's fine to do it because we're hoping that this code will never be triggered once installation of the source package has completed is like saying it's fine to drive on the left side of the road because there are no cars coming in the opposite direction. Until there is one. Like for example when some system admin decides to use --no-test-load to install Bioconductor on a read-only partition.

Why not use the standard mechanism for that e.g. configure and configure.win scripts?

I literally do not have access to a Windows box.

I get that. Problem is that you're not showing much interest in trying to change that.

LTLA commented 4 years ago

Why not use the standard mechanism for that e.g. configure and configure.win scripts?

At the time, I had a bunch of R code that I didn't want to have to wrap in R --no-save -e. I guess I could do it now that I have a better idea of what is being called and when.

Problem is that you're not showing much interest in trying to change that.

Not sure what I'm meant to do. I'm not going to buy my own Windows machine, and I'd rather not use company resources for my personal side projects.