Open jkbonfield opened 8 years ago
+1
Future development would be much faster once the basics are sane.
Dr Adam Spargo, Vertebrate Resequencing, Wellcome Trust Sanger Institute
I agree.
A suggestion: use a single prefix for functions and structures dedicated to SAM, BAM and CRAM files, headers and records alike: aln_...
In writing a quick bam_consensus tool I ended up with a structure like this:
It's pretty bizarre to have so many prefixes in there. Add to this the fact that the bam_hdr_t struct is read using sam_hdr_read and destroyed with bam_hdr_destroy, while the index for the iterator is read using sam_index_load and destroyed with hts_idx_destroy, and we have a mishmash of both sam/bam/hts format prefixes and index/idx spellings in function names.
Most of this is due to backwards compatibility, but it would be good to make all the orthogonal API functions exist too, so the code is more readable, and then to document which ones are supported and which are deprecated. It may also just be a matter of fixing the existing functions. E.g. hts_idx_load does exist, but it's not actually the opposite of hts_idx_destroy (load doesn't load CRAM indices, but destroy does destroy them).
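To make the "add the orthogonal name, keep the old one, mark it deprecated" idea concrete, here is a rough sketch of the pattern. Everything in it is a toy stand-in: toy_idx_t, toy_idx_load, toy_idx_destroy, toy_index_load and TOY_DEPRECATED are hypothetical names invented for illustration, not htslib types or functions, and this is not a proposal for the actual signatures.

```c
#include <stdlib.h>

/* Toy stand-in for an index type; NOT the real htslib hts_idx_t. */
typedef struct {
    int n_refs;
} toy_idx_t;

/* Hypothetical deprecation marker: expands to a compiler attribute on
 * GCC/Clang and to nothing elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#  define TOY_DEPRECATED(msg) __attribute__((deprecated(msg)))
#else
#  define TOY_DEPRECATED(msg)
#endif

/* New, consistently named pair: load and destroy share the "toy_idx_"
 * prefix and are exact opposites of each other. */
toy_idx_t *toy_idx_load(int n_refs)
{
    toy_idx_t *idx = malloc(sizeof(*idx));
    if (idx)
        idx->n_refs = n_refs;
    return idx;
}

void toy_idx_destroy(toy_idx_t *idx)
{
    free(idx); /* free(NULL) is a no-op, so a NULL index is fine */
}

/* Old spelling ("index" rather than "idx") kept so existing callers still
 * compile; it forwards to the new name and warns at the call site. */
TOY_DEPRECATED("use toy_idx_load() instead")
static inline toy_idx_t *toy_index_load(int n_refs)
{
    return toy_idx_load(n_refs);
}
```

With something along these lines, new code would only ever see one prefix per object, while old code keeps building and gets a compile-time nudge towards the supported name.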
It's a long-term goal though, and not something we have a great deal of time to do right now.