Closed jmarshall closed 2 years ago
Thanks for the feedback. I think we can get rid of the cram/*.h
files. AFAIK no Bioconductor package is using them yet. Then we should also be able to get rid of header.h
, which was only installed because cram/cram.h
needs it.
I'll also look into getting rid of textutils_internal.h
and hts_internal.h
. They were added a long time ago but maybe some Bioconductor packages are still using them, in which case those packages will first need to be modified.
@mtmorgan It looks like in the cram
branch of Rsamtools you're including cram/cram.h
: https://github.com/Bioconductor/Rsamtools/blob/6bab10b94573cc6cdc5f9aa0585f9246dae9b7ca/src/hts_utilities.c#L2
This is the only reason I exposed the cram/*.h
files in the latest Rhtslib. Do you think you can use htslib/cram.h
instead? Thanks.
Yes I'll adjust
Ok so I reverted commit 76c5c2362a272e9d46638638851cb78aebdc8768. See commit 14645306bf45504caca6dec13e07693e23a5b6f3.
Turns out that we still need to install hts_internal.h
(and consequently textutils_internal.h
, which is used in hts_internal.h
) because Rsamtools makes calls to hts_idx_getfn()
(see here), a function that is declared in hts_internal.h
.
@jmarshall Any chance hts_idx_getfn()
can be moved to HTSlib's public API, or should we use something else in Rsamtools?
Thanks, H.
There's no particular need to preflight the index filename. I would suggest using bcf_index_load()
instead of bcf_index_load2()
, and rephrasing the resulting error message as per the previous one:
bfile->index = bcf_index_load(fn);
if (bfile->index == NULL) {
…
Rf_error("no valid VCF/BCF index found\n filename: %s", fn);
Ah I see. This code is a relic from the samtools 0.1.19 era when it seems that the only way to load the BCF index was to call bcf_idx_load()
on the index filename. When Rsamtools got migrated from samtools 0.1.19 to htslib 1.7 a few years ago, this code didn't get updated in a way that would take advantage of bcf_index_load()
.
I made the change in Rsamtools 2.13.3. Thanks for the suggestion.
Also modified Rhtslib to no longer install textutils_internal.h
and hts_internal.h
. This is in Rhtslib 1.99.5 (commit a5335e1fea92ea06c33d42c7e290b6d8b4245ad3).
Cheers, H.
I notice that the recent 76c5c2362a272e9d46638638851cb78aebdc8768 makes a number of HTSlib's private internal header files available to Rhtslib users: the internal cram/*.h headers and the internal header.h; meanwhile the internal _htsinternal.h, and _textutilsinternal.h have already been made available since Rhtslib 1.14.0.
Only the API functions in HTSlib's *public headers in _htslib-x.y/htslib/.h_** are supported stable interfaces and not subject to change.
As @jkbonfield can confirm, HTSlib's public CRAM-specific interfaces are those defined in htslib/cram.h. You do not need to install any of the internal headers in cram/*.h and should not, as they are likely to change incompatibly without notice. If any dependent R libraries do need functionality from these internal headers, you should raise an issue on HTSlib to have the desired functionality or an equivalent added to the public htslib/cram.h header.
Also header.h is an internal HTSlib header that is likely to change incompatibly without notice. The supported public interfaces to this header functionality are in htslib/sam.h, so there is no need to install the internal header.h header file.
Similarly there is no supported public functionality in _htsinternal.h or _textutilsinternal.h — the clue's in the name! — so there should be no need to make these available to Rhtslib users either. I see these have been being installed since Rhtslib 1.14.0 — what was the rationale for this? Have they ever been used by an R library?