grimbough / rhdf5

Package providing an interface between HDF5 and R
http://bioconductor.org/packages/rhdf5
59 stars 22 forks source link

fix UTF8 string encoding #101

Closed ilia-kats closed 2 years ago

ilia-kats commented 2 years ago

h5createAttribute/h5writeAttribute needed the argument to be "UTF-8" and passed it to H5Tset_cset, which needed the argument to be "UTF8". Therefore, setting the encoding to UTF-8 failed. This patch changes h5createAttribute and h5writeAttribute to accept "UTF8".

grimbough commented 2 years ago

Thanks a lot for pointing this out. I'm inclined to go the other way and use the "UTF-8" notation, principle because that's the official name of the encoding.

I see there's also some inconsistencies where this is set either via the cset or encoding arguments depending on whether its and attribute or a dataset that's being created. I'll think I'll set both to use encoding going forward, but I'll only deprecated the cset argument so it'll still work if you're currently using it.

ilia-kats commented 2 years ago

Thanks for the feedback, I updated the PR to use UTF-8instead of UTF8.

grimbough commented 2 years ago

Did you see that I made a pull request on your own repo with some additional changes? Those pass R CMD check ( https://github.com/grimbough/rhdf5/actions/runs/1515325296)

It'd be good if you could look over those, so we can keep in sync. I can't keep track of what you're doing with these force pushes.

ilia-kats commented 2 years ago

Apologies, I didn't get an email for your PR and so I missed it. I merged it now, thanks.

grimbough commented 2 years ago

No problem. I'll wait until the CI check have passed, and then merge this master and push to Bioconductor.

That is, assuming the changes do what you want!

ilia-kats commented 2 years ago

Great, thanks. My usecase for the moment is pretty much covered by the h5createAttribute/h5writeAttribute part, so yeah, it does what I want :)

codecov[bot] commented 2 years ago

Codecov Report

Merging #101 (c7bb091) into master (87623b2) will increase coverage by 0.02%. The diff coverage is 82.60%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #101      +/-   ##
==========================================
+ Coverage   77.82%   77.85%   +0.02%     
==========================================
  Files          35       35              
  Lines        1876     1892      +16     
==========================================
+ Hits         1460     1473      +13     
- Misses        416      419       +3     
Impacted Files Coverage Δ
R/h5create.R 84.48% <55.55%> (-1.75%) :arrow_down:
R/H5T.R 91.89% <100.00%> (+3.00%) :arrow_up:
R/h5write.R 83.91% <100.00%> (+0.11%) :arrow_up:
R/h5writeAttr.R 90.24% <100.00%> (+2.00%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d7e1897...c7bb091. Read the comment docs.

grimbough commented 2 years ago

This has now been merged and should be available from Bioconductor in rhdf5 version 2.39.2 in the next day or so. Thanks a lot for the help.