bjcairns / ukbschemas

Use R to generate a database containing the UK Biobank data schemas from http://biobank.ctsu.ox.ac.uk/crystal/schema.cgi

Error in guess_header_ #30

Open MaximMoinat opened 3 years ago

MaximMoinat commented 3 years ago

When I run db <- ukbschemas_db(), either with a new temp path or pointing at the existing SQLite file, I get the following cryptic error:

Error in guess_header_(datasource, tokenizer, locale) : 
  Expected single logical value

It used to work fine. Did any of the dependencies change?
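One way to check whether a dependency changed is to record the installed versions of the packages involved and compare them against a setup that still works. The sketch below is only an illustration: it assumes readr and purrr are the relevant dependencies (guess_header_ is an internal readr function), and the helper name dep_versions is hypothetical, not part of ukbschemas.

```r
# Hypothetical helper: report the installed version of each package,
# or NA if the package is not installed.
dep_versions <- function(pkgs = c("readr", "purrr", "ukbschemas")) {
  vapply(
    pkgs,
    function(p) {
      if (requireNamespace(p, quietly = TRUE)) {
        as.character(utils::packageVersion(p))
      } else {
        NA_character_
      }
    },
    character(1)
  )
}

# Named character vector, one entry per package
print(dep_versions())
```

Posting this output alongside sessionInfo() usually makes a version-related regression easier to narrow down.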

bjcairns commented 3 years ago

Thank you very much for this. Somewhat delayed by the Christmas break, but we are taking a look now. We think it is the result of a dependency API change (in the purrr package), and we're currently removing this and some other dependencies to avoid this in future.

There are some further issues with file download from the UKB website, which we also have a workaround for.

We'll try to update with a working dev branch, while some of the above is still in progress.

ypouliot commented 3 years ago

Howdy Benjamin. Any news on this? :-)

bjcairns commented 3 years ago

We're making progress on this and it appears the original problem is solved, but in the process we have needed to rework the fix for additional issues like malformed files on the UKB side (particularly the "returns" table, which unfortunately has an unsanitised entry in the notes column). We're debugging some side effects of those fixes currently but need to get down to the individual functions that are failing.

The latest version is in the import-populate-fixes branch.

The current version at that branch (7b4a340) will successfully load most of the database into R, except (for reasons we don't fully understand yet) the schema table (the table which lists the schemas and their ID numbers). It might work for some purposes in the meantime, but it currently fails to update the SQLite database correctly.

EDIT: Just to add, that branch is experimental and includes some other API changes. Apologies if these trip anyone up; we'll try to iron out these wrinkles before it's merged into the main dev and then master branches.
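For anyone who wants to try that branch, a minimal sketch follows. It assumes the remotes package is used for installation (the thread does not say which installer to use), and the helper name branch_spec is hypothetical. The install step is guarded because it needs network access.

```r
# Hypothetical helper: build a "user/repo@ref" spec, the form accepted
# by remotes::install_github().
branch_spec <- function(ref, repo = "bjcairns/ukbschemas") {
  paste0(repo, "@", ref)
}

spec <- branch_spec("import-populate-fixes")

# Installing needs network access and the remotes package, so it only
# runs in an interactive session where remotes is available.
if (interactive() && requireNamespace("remotes", quietly = TRUE)) {
  remotes::install_github(spec)
}
```

Passing a commit hash such as 7b4a340 as ref instead of the branch name should pin the install to that exact state of the branch.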

MaximMoinat commented 3 years ago

Awesome, thanks for keeping us up-to-date. Your work is massively appreciated!

bjcairns commented 3 years ago

The latest commit in import-populate-fixes, which is c804e6d at time of writing, now passes all the tests that are currently being run. 🎉

Five tests are skipped because of parsing errors relating to text encodings in the schema files from UKB (these seem to be minor, but do let us know if not). We hope to fix these, but for now we skip the tests that check that there are no warnings from ukbschemas() and ukbschemas_db().

Some other updates relative to an earlier version in this branch:

This issue will stay open until it is solved in the current release version. Thanks for your input, encouragement, and patience!

xfhxfyyaxw commented 2 years ago

Hi Benjamin, I am using db <- ukbschemas_db(path = tempdir()), but I still get the error: Expected single logical value. How can I get rid of this error? Many thanks :)