maheydari opened 6 years ago
git clone --recurse-submodules https://github.com/dpryan79/SE-MEI
cd SE-MEI
make
Or something along those lines.
Thanks for your answer; I could compile the code successfully. However, the htslib folder you ship with the project is an older version. I downloaded the latest version, compiled it, and replaced your older copy with it. Now when I try to compile your code again, I get errors like these:
/home/mahdi/SE-MEI2/htslib/hfile_libcurl.c:1138: undefined reference to `curl_easy_setopt'
/home/mahdi/SE-MEI2/htslib/hfile_libcurl.c:1124: undefined reference to `curl_multi_init'
/home/mahdi/SE-MEI2/htslib/hfile_libcurl.c:1127: undefined reference to `curl_easy_init'
/home/mahdi/SE-MEI2/htslib/cram/cram_io.c:966: undefined reference to `BZ2_bzBuffToBuffDecompress'
Could you please help me understand what I did wrong? Can I instead install htslib in my home directory, rather than in the subdirectory, and link it to this program?
Don't download the latest version. This is an old project, it was written for the old htslib version it comes with.
Thank you very much. Your comments are always helpful, especially in the Biostars community. I want to write a very simple program that reads a BAM file using the latest version of htslib, and I wanted to get some ideas from your code on how to do that. If you have any suggestions, please let me know.
Ah, well you can get the gist of things from my code. The htslib API hasn't really changed that drastically since I wrote this.
Indeed, it's very useful. I am looking for functionality similar to seekg or tellg while reading a BAM file. I don't want to parse the reads from the beginning; instead, I want to jump to random places in the file and parse some of the reads there. I couldn't find this in the project. Do you have any experience with something similar?
You can seek to arbitrary blocks, likely using non-exported functions from htslib, but you can't just say "seek to read 123456", since the files aren't structured in a way to make that practical. I suggest you either use reservoir sampling or read the header and then sample random intervals. The files are structured such that it's quick to get reads overlapping a given interval. We use that methodology in deepTools when we need to randomly sample reads.
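For readers unfamiliar with the reservoir sampling mentioned above, here is a minimal sketch of the classic "Algorithm R" in plain C++. It deliberately does not touch htslib: the `Reservoir` class and its method names are hypothetical, and in a real program each offered item would presumably be a read (or a copy of one) obtained while streaming through the BAM file once.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: keeps a uniform random sample of up to k items
// from a stream whose total length is not known in advance (Algorithm R).
template <typename T>
class Reservoir {
public:
    Reservoir(std::size_t k, std::uint64_t seed) : k_(k), rng_(seed) {}

    // Offer the next item of the stream to the reservoir.
    void offer(const T &item) {
        ++seen_;
        if (sample_.size() < k_) {
            sample_.push_back(item);  // the first k items always go in
            return;
        }
        // Draw j uniformly from [0, seen_ - 1]; the new item replaces
        // slot j only if j lands inside the reservoir (probability k/seen_).
        std::uniform_int_distribution<std::uint64_t> dist(0, seen_ - 1);
        const std::uint64_t j = dist(rng_);
        if (j < k_) {
            sample_[j] = item;
        }
    }

    const std::vector<T> &sample() const { return sample_; }
    std::uint64_t seen() const { return seen_; }

private:
    std::size_t k_;
    std::uint64_t seen_ = 0;
    std::mt19937_64 rng_;
    std::vector<T> sample_;
};
```

Usage would be along the lines of constructing `Reservoir<int> r(5, 42);`, calling `r.offer(x)` for every item, and reading `r.sample()` at the end. The trade-off is that this needs one full pass over the file, whereas sampling random intervals through the index avoids reading everything.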
I ended up with a solution yesterday, which is probably also what you are suggesting. I was looking at this project: https://github.com/hasindu2008/simple_bam_parser/blob/master/randomacess.c. The only thing I changed is the interval, which was given as a string; I now pass two int numbers to specify the start and end. (Are the start and end actually the first and last read in the chunk, or did I misunderstand?) Here is my code (not all of it, just the relevant part):
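samFile *in = sam_open(argv[1], "r");               // open the BAM file
hts_idx_t *idx = sam_index_load(in, argv[1]);       // load its index
hts_itr_t *iter = sam_itr_queryi(idx, 0, 10, 100);  // iterator for the requested interval
bam1_t *b = bam_init1();                            // record buffer reused for every read
while (sam_itr_next(in, iter, b) >= 0) {
    cout << b->core.pos << endl;
}
// release everything that was allocated above
bam_destroy1(b);
hts_itr_destroy(iter);
hts_idx_destroy(idx);
sam_close(in);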
Is that correct? My other question is: does the sam_index_load function load the whole BAM file into memory? I see a sudden jump in memory usage when I call it.
I really appreciate you dedicating your time to answering my random questions.
Update: I found your answer to a similar question here, which is also very similar to the above project: https://www.biostars.org/p/151053/
It only loads the index into memory, which is rather small. The memory jump likely has more to do with how many reads are in the bgzip blocks, since you have to load one of those into memory to decompress it.
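As background on how an index can point into bgzip blocks: the SAM/BAM specification describes BGZF "virtual file offsets", a 64-bit value that packs the compressed-file position of a block into the upper 48 bits and the position inside that block's uncompressed data into the lower 16 bits. The sketch below only illustrates that packing; the struct and function names are hypothetical and are not part of htslib.

```cpp
#include <cstdint>

// Hypothetical type: the two components of a BGZF virtual file offset.
struct VirtualOffset {
    std::uint64_t block_start;   // byte offset of the block in the compressed file
    std::uint16_t within_block;  // byte offset inside the uncompressed block
};

// Pack the two components: block_start << 16 | within_block.
constexpr std::uint64_t pack_voffset(std::uint64_t block_start,
                                     std::uint16_t within_block) {
    return (block_start << 16) | within_block;
}

// Split a virtual offset back into its two components.
constexpr VirtualOffset unpack_voffset(std::uint64_t v) {
    return VirtualOffset{v >> 16, static_cast<std::uint16_t>(v & 0xFFFF)};
}
```

Because the lower field is only 16 bits wide, reaching any read means locating its block in the compressed file and then decompressing that block, which is why at least one block has to be held in memory.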
Hi @dpryan79, I'm using this as a dependency of ERVtools and ran into compilation problems too. I eventually found a solution, but it took a while: htslib 1.1 works fine with this package (https://github.com/samtools/htslib/releases/tag/1.1). Could you include this in the readme.md? Thanks
@markziemann I've mentioned that in the readme just now. The submodule itself should actually be using 1.1 already, though I guess updating it would cause an issue. If you end up needing a newer htslib, let me know and I'll try to update this (I haven't really used it in a number of years).
Could you please provide a minimal example of how to compile this program? The reason I am looking at it is that I want an example of how to use the samtools C API. I saw some of your code that uses this API, but I couldn't compile it.