tzcoolman / FACS-OLD


API for DRASS #20

Closed by brainstorm 11 years ago

brainstorm commented 11 years ago

Since we want to have a Python interface wrapped as a C extension, we think it's good to establish a common API for DRASS that will survive internal changes and refactoring. In Python, it would look like this:

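#!/usr/bin/env python

from drass import *

# Build a Bloom filter from the reference genome
build("example.fa", "example_filter.bloom")
# Query a string against the filter; the keyword arguments show the proposed defaults
query("example_string", "example_filter.bloom", mode=1, k_mer=21, tole_rate=0.8, error_rate=0.0005, sampling_rate=1, prefix=None)
# Remove reads that match the reference from a FASTQ file
remove_contaminants("reference", "fastq_file")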

Therefore the C counterpart should call the DRASS internal methods, but have a more accessible/simple prototype for the developer:

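/* Returns an int that respects exit codes: 0 = success */
int build(const char *reference_genome_filename, const char *bloom_filename);

/* C has no default arguments, so the defaults (mode=1, k_mer=21, tole_rate=0.8,
   error_rate=0.0005, sampling_rate=1, prefix=NULL) have to be supplied by the caller,
   for example by the Python wrapper. The rates are fractional, so they are double here.
   The int return type is an assumption, chosen to match build(). */
int query(const char *example_string, const char *bloom_filename, int mode, int k_mer, double tole_rate, double error_rate, int sampling_rate, const char *prefix);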

etc...
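To illustrate how the Python defaults could sit on top of a plain C prototype, here is a minimal sketch. It uses ctypes rather than the hand-written C extension discussed above, only because that keeps the example short. The argument types, the int return value of query, the helper name bind_drass and the library file name in the comment are all assumptions, not the actual DRASS code.

```python
import ctypes


def bind_drass(lib):
    """Declare the proposed C prototypes on a loaded library and return wrappers.

    `lib` is expected to be a ctypes.CDLL exposing `build` and `query`.
    Types below mirror the proposed prototypes and are assumptions.
    """
    lib.build.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
    lib.build.restype = ctypes.c_int
    lib.query.argtypes = [
        ctypes.c_char_p, ctypes.c_char_p,   # query string, Bloom filter file
        ctypes.c_int, ctypes.c_int,         # mode, k_mer
        ctypes.c_double, ctypes.c_double,   # tole_rate, error_rate
        ctypes.c_int, ctypes.c_char_p,      # sampling_rate, prefix
    ]
    lib.query.restype = ctypes.c_int

    def build(reference, bloom):
        # Follow the exit-code convention: anything other than 0 is a failure.
        status = lib.build(reference.encode(), bloom.encode())
        if status != 0:
            raise RuntimeError("build failed with status %d" % status)

    def query(string, bloom, mode=1, k_mer=21, tole_rate=0.8,
              error_rate=0.0005, sampling_rate=1, prefix=None):
        # The defaults live here because C cannot express them.
        # None is passed through as a NULL pointer.
        c_prefix = prefix.encode() if prefix is not None else None
        return lib.query(string.encode(), bloom.encode(), mode, k_mer,
                         tole_rate, error_rate, sampling_rate, c_prefix)

    return build, query


# Hypothetical usage, assuming a shared library named libdrass.so exists:
#   build, query = bind_drass(ctypes.CDLL("./libdrass.so"))
#   build("example.fa", "example_filter.bloom")
```

Keeping the defaults in the wrapper means the C signature can stay stable while the Python side remains convenient to call.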

tzcoolman commented 11 years ago

Hej Roman,

I have built sp_build, which is used for rapid Bloom filter building. It is in the Zlibc branch. Try it and give me feedback, maybe.

Enze

brainstorm commented 11 years ago

Enze, it looks like you have not pushed that branch (I cannot see it). By default, a plain git push does not publish a new local branch that has no counterpart on the remote, unless you name it explicitly or specify:

git push --all

tzcoolman commented 11 years ago

Sorry... didn't notice that. I have pushed everything now.

brainstorm commented 11 years ago

Thanks for this first approach to the API, Enze!

I've merged it and rewritten my Python wrapper:

https://github.com/brainstorm/DRASS/commit/5594bbec79177d1aac1497d7b7a97a73a1492e78

But I get a segmentation fault after this output:

hashes was -1272106950,size 2035040460729570645

I guess it's a matter of some missing initialization. Have you tested the C interface yourself? How do you call sp_build?

Thanks again Enze!
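A negative hash count and a size of roughly 2e18 in the log line above are both implausible, which fits the guess about missing initialization. As a rough sketch, a wrapper could reject such values before they reach the C code. The function name, the assumption that the size is in bits, and the upper bound are all hypothetical.

```python
# Hypothetical upper bound: 2**40 bits is 128 GiB, far more than a filter should need.
MAX_PLAUSIBLE_BITS = 2 ** 40


def check_bloom_params(hashes, size, max_size=MAX_PLAUSIBLE_BITS):
    """Return a list of problems found in the Bloom filter parameters."""
    problems = []
    if hashes <= 0:
        problems.append("hash count must be positive, got %d" % hashes)
    if not 0 < size <= max_size:
        problems.append("filter size %d is outside (0, %d]" % (size, max_size))
    return problems


# Values taken from the log line in this thread: both are flagged.
for problem in check_bloom_params(-1272106950, 2035040460729570645):
    print(problem)
```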

brainstorm commented 11 years ago

This is progressing very well. The only function left to implement is contamination removal, which is held up by some global variable issues that came up during the refactoring.

There are also concerns about how to do contamination removal when the files are compressed. The general principles that came out of the meeting were:

Working in chunks in RAM and avoiding temp files.
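As a rough illustration of that principle, the sketch below streams a gzip-compressed FASTQ file, filters it a chunk of records at a time, and writes compressed output directly, so nothing is decompressed to disk. The function names and the is_contaminant callback (a stand-in for the Bloom filter query) are hypothetical, and gzip is assumed as the compression format.

```python
import gzip
from itertools import islice


def fastq_records(handle):
    """Yield FASTQ records as tuples of 4 lines read from a text handle."""
    while True:
        record = tuple(islice(handle, 4))
        if len(record) < 4:
            return  # end of input (an incomplete trailing record is dropped)
        yield record


def filter_fastq(src, dst, is_contaminant, chunk_size=10000):
    """Copy reads from gzip FASTQ `src` to gzip FASTQ `dst`, skipping contaminants.

    `src` and `dst` may be paths or binary file objects. At most `chunk_size`
    records are held in memory at once. Returns the number of reads kept.
    """
    kept = 0
    with gzip.open(src, "rt") as reader, gzip.open(dst, "wt") as writer:
        records = fastq_records(reader)
        while True:
            chunk = list(islice(records, chunk_size))
            if not chunk:
                break
            # The second line of each record is the sequence.
            clean = [r for r in chunk if not is_contaminant(r[1].strip())]
            writer.writelines(line for r in clean for line in r)
            kept += len(clean)
    return kept
```

The chunk size trades memory for fewer calls into the filter; it could be tuned once the real query cost is known.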

Closing; we'll open a specific issue for contamination removal when it is due.