PoisotLab / GBIF.jl

Functions and types to access GBIF data from Julia
https://ecojulia.github.io/GBIF.jl/latest/
Other
19 stars 4 forks source link

add Tables.jl interface #41

Closed rafaqz closed 2 years ago

rafaqz commented 3 years ago

This PR adds a standard Tables.jl interface and removes Requires.jl and DataFrames code, as it's no longer needed.

Now a CSV can be saved directly from a GBIFRecords object without DataFrames.jl being involved.

A DataFrame will now include all fields as-is form the GBIFRecord (which may be up for debate, but e.g. basisOfRecord can be good to have in a table...), as well as the taxon fields previously included.

This is done by adding methods to propertynames and getproperty, which means a GBIFRecord also has the same properties accessible with the record.property syntax. To get the actual taxon object, you use getfield(record, :taxon). Or we can add a wrapper method.

rafaqz commented 3 years ago

The error seems to be an issue with DataFrames and CategoricalArrays? wierd

tpoisot commented 3 years ago

Oh it's what is making the tests fail in #42 - I like the idea (@gabrieldansereau might like it even more as he's our resident "export thing as CSV" evangelist).

If the issue really is 1.5, we might make this a major version and set 1.6 as the lower bound.

rafaqz commented 3 years ago

The docs on all these PRs also have a problem with QT not finding the xcb library to run visually without X on linux.

codecov[bot] commented 3 years ago

Codecov Report

Merging #41 (1e04917) into main (23b6921) will decrease coverage by 1.03%. The diff coverage is 81.81%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #41      +/-   ##
==========================================
- Coverage   85.95%   84.91%   -1.04%     
==========================================
  Files          10        9       -1     
  Lines         178      179       +1     
==========================================
- Hits          153      152       -1     
- Misses         25       27       +2     
Flag Coverage Δ
unittests 84.91% <81.81%> (-1.04%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/tables.jl 66.66% <66.66%> (ø)
src/types/GBIFRecords.jl 90.62% <87.50%> (-1.05%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 23b6921...1e04917. Read the comment docs.

rafaqz commented 3 years ago

The problem was just old dependencies in the tests. I'm not sure why Pkg chose them, but adding fixed versions like 1 for DataFrames seems to have fixed the issue. This is good to go.

rafaqz commented 3 years ago

I would really like to merge this... but it needs review some.

Specifically it may break everything that built off dataframes output before because it includes most of the fields now, and some appear to have previously been named differently to GBIF, and to the struct field names? Now there is no distinction.

tpoisot commented 3 years ago

Can you rebase on the main branch? The type piracy issue is gone

tpoisot commented 3 years ago

I really want a review from @gabrieldansereau on this - if it's going to break things in your workflows, then maybe we shouldn't merge it yet.

tpoisot commented 3 years ago

I also won't merge unless properly documented - I'll start my review when all the elements are here.

rafaqz commented 3 years ago

I also won't merge unless properly documented - I'll start my review when all the elements are here.

Before I start, can I clarify what you want documented? Nothing in the current docs appears to be changed by this PR.

We could however add a listing of available table columns, which are not currently documented.

tpoisot commented 3 years ago

We could however add a listing of available table columns, which are not currently documented.

Yep

rafaqz commented 3 years ago

This is documented and ready to go, besides possible changes to GBIFRecord that would make the code a lot simpler.