kenhanscombe / ukbtools

An R package to manipulate and explore UK Biobank data
https://kenhanscombe.github.io/ukbtools/
96 stars 26 forks source link

Add utility downloader #29

Open muschellij2 opened 4 years ago

muschellij2 commented 4 years ago

I added functions to download the utilities so that if ukbmd5 and such would be in your PATH, then Sys.which finds it, it will use that. Otherwise, it will download that binary file and run Sys.chmod so that it can run it as an executable. By default, it downloads them to the temporary directory to not violate CRAN policy.

For example:

file = "ukbXXXX.enc"
checksum = "XXXasdf4XXXX"
out = ukb_md5(file)
stopifnot(out == checksum)

out = ukb_unpack(file, key = keyfile)
muschellij2 commented 4 years ago

Any thoughts on merging this?

Also - do you think adding in the option to ukb_df to put in a CSV that has the withdraw eids?

kenhanscombe commented 4 years ago

Hey, sorry for the delay just looking at this now. Good idea. Mainly updating documentation and style. Will push commits to your branch shortly.

kenhanscombe commented 4 years ago

Could you separate "utilities" and "withdrawals" additions, preferrably to separate feature branches?

We may be able to incorporate the withdrawals additions before the utilities – I could not get noah to work on macOS 10.14.6 mojave. More generally I'm concerned that noah is archived and therefore no longer supported, and my problems with noah are in fact an unresolved issue #43.

I think I'm inclined to take the addition of utilies in a different direction: I've created a docker image which contains all the md5 checksum'd programs and utilities. It's seems far easier to bind mount to a local directory and ukbunpack, ukbconv, etc. using the linux container (regardless of local operating system). Yes, it will require users to install docker but that's straightforward and supported. Will push this functionality to a feature branch shortly.

Depending how well that works, we may still use your "direct" approach for linux and windows users.

muschellij2 commented 4 years ago

I don't have the time to separate them out right now. I think Docker is fine, and while I agree it's a straightforward install, I think 1) it's a heavy lift for most users, 2) if you're working with this data it's likely on a cluster (aka linux) and probably can't use docker (but probably can use singularity), but I agree that you lose out on macOS users (but UK Biobank made that decision really for you).

If you go down the Docker road, I think https://rdrr.io/cran/xaringan/man/decktape.html is a good example of it used in a package.

Best, John

On Tue, Sep 1, 2020 at 8:19 AM Ken Hanscombe notifications@github.com wrote:

Could you separate "utilities" and "withdrawals" additions, preferrably to separate feature branches?

We may be able to incorporate the withdrawals additions before the utilities – I could not get noah to work on macOS 10.14.6 mojave. More generally I'm concerned that noah is archived and therefore no longer supported, and my problems with noah are in fact an unresolved issue #43 https://github.com/linux-noah/noah/issues/43.

I think I'm inclined to take the addition of utilies in a different direction: I've created a docker image https://hub.docker.com/repository/docker/onekenken/alpine-ukb-progs which contains all the md5 checksum'd programs and utilities. It's seems far easier to bind mount to a local directory and ukbunpack, ukbconv, etc. using the linux container (regardless of local operating system). Yes, it will require users to install docker but that's straightforward and supported. Will push this functionality to a feature branch shortly.

Depending how well that works, we may still use your "direct" approach for linux and windows users.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kenhanscombe/ukbtools/pull/29#issuecomment-684811049, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIGPLRYURNVOEWXYAEZGULSDTRGVANCNFSM4OFBDVDA .

muschellij2 commented 4 years ago

I guess my overall recommendation is to do an OS switch and use Docker for OSX users. I don't know what you'd want to do for Solaris and other OSes that CRAN supports, however. Best, John

On Tue, Sep 1, 2020 at 8:53 AM John Muschelli muschellij2@gmail.com wrote:

I don't have the time to separate them out right now. I think Docker is fine, and while I agree it's a straightforward install, I think 1) it's a heavy lift for most users, 2) if you're working with this data it's likely on a cluster (aka linux) and probably can't use docker (but probably can use singularity), but I agree that you lose out on macOS users (but UK Biobank made that decision really for you).

If you go down the Docker road, I think https://rdrr.io/cran/xaringan/man/decktape.html is a good example of it used in a package.

Best, John

On Tue, Sep 1, 2020 at 8:19 AM Ken Hanscombe notifications@github.com wrote:

Could you separate "utilities" and "withdrawals" additions, preferrably to separate feature branches?

We may be able to incorporate the withdrawals additions before the utilities – I could not get noah to work on macOS 10.14.6 mojave. More generally I'm concerned that noah is archived and therefore no longer supported, and my problems with noah are in fact an unresolved issue #43 https://github.com/linux-noah/noah/issues/43.

I think I'm inclined to take the addition of utilies in a different direction: I've created a docker image https://hub.docker.com/repository/docker/onekenken/alpine-ukb-progs which contains all the md5 checksum'd programs and utilities. It's seems far easier to bind mount to a local directory and ukbunpack, ukbconv, etc. using the linux container (regardless of local operating system). Yes, it will require users to install docker but that's straightforward and supported. Will push this functionality to a feature branch shortly.

Depending how well that works, we may still use your "direct" approach for linux and windows users.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/kenhanscombe/ukbtools/pull/29#issuecomment-684811049, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIGPLRYURNVOEWXYAEZGULSDTRGVANCNFSM4OFBDVDA .

kenhanscombe commented 4 years ago

Thanks, John.

I think you're right; we'll have to use an OS switch of sorts with docker for mac OS – singularity will work for linux but not for mac (beta version exists but future seems uncertain). Not sure about other OSs. I'm pretty sure I can hide the docker interactions in the backend so that users are only exposed to (pretty much) the list of functions you wrote.

Considering above, I'll work on incorporating the OS switch option (with docker for mac OS, and any other OSs that support docker) into your PR. Will push when I have something substantive.