hzi-bifo / Haploflow

GNU General Public License v3.0
Problem with HIV-3 test data set #19

Closed Viola-TA closed 1 year ago

Viola-TA commented 1 year ago

Hello all, I have installed Haploflow in its own environment, and haploflow --help prints the expected help text. I then wanted to test Haploflow with the HIV_3_toy file, but I get an error that I don't understand, and so far I have not been able to find a solution.

The command I used: ./haploflow --read-file ~/Downloads/HIV_3_toy.fq.gz --out Test_Haploflow --log Test_Haploflow/log

The error I get: terminate called after throwing an instance of 'std::out_of_range'. what(): vector::_M_range_check: __n (which is 0) >= this->size() (which is 0) Aborted (core dumped)

The folder Test_Haploflow is created. It contains two files: Cov.tsv (0 bytes) and log (448 bytes).

The log file is very short:

Options used: strict 5, k 41, error-rate 0.02, two-strain False, long contigs: False, filter 500, threshold -1
Building deBruijnGraph...
Building deBruijnGraph took 0.000351 seconds.
deBruijnGraph has 0 vertices
Building unitig graph from deBruijn graph...
Getting connected components
Getting CCs took 7e-06 seconds
Calculating coverage distribution
Calculating coverage distribution took 1.6e-05 seconds
Printing the biggest graph with 0 k-mers

Can anyone help me with this problem? Many thanks, Viola

AlphaSquad commented 1 year ago

Hi Viola, unfortunately handling of compressed files had not yet been added in the bioconda version of Haploflow (1.0), so you would need to gunzip the toy data set first. Let me know if that resolves your error, thanks!
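
For anyone scripting this step, here is a minimal Python sketch of the decompression described above. The helper names (is_gzip, decompress_if_gzip) and the fallback file suffix are hypothetical and not part of Haploflow; the sketch only relies on Python's standard gzip and shutil modules.

```python
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def decompress_if_gzip(path):
    """Write a decompressed copy next to a gzip file and return its path.

    Files that are not gzip-compressed are returned unchanged.
    (Hypothetical helper, not part of Haploflow.)
    """
    path = Path(path).expanduser()
    if not is_gzip(path):
        return path
    if path.suffix == ".gz":
        target = path.with_suffix("")  # e.g. reads.fq.gz -> reads.fq
    else:
        # Assumed naming for gzip files without a .gz extension.
        target = path.with_name(path.name + ".decompressed")
    with gzip.open(path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams the data, so large files are fine
    return target
```

The returned path can then be passed to --read-file in place of the .gz file, for example decompress_if_gzip("~/Downloads/HIV_3_toy.fq.gz").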

Viola-TA commented 1 year ago

Yes, I was able to start the programme with the decompressed file, but it only ran for a few seconds. The output folder was created with three files: contigs.fa (with 3 contigs), Cov.tsv, and log. The subfolders are missing, though. Attached: log.txt

AlphaSquad commented 1 year ago

The toy data set consists of 3 HIV strains and is relatively small, so Haploflow running for only a couple of seconds and producing exactly 3 contigs is expected behaviour. I noticed that the documentation on the front page is not up to date. The subfolders are only created if you run Haploflow with the --debug flag (otherwise Haploflow would flood these folders with a lot of files for bigger data sets). I updated this information on the front page, thanks for bringing this to my attention.
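
A quick way to confirm the contig count after a run is to count the FASTA header lines in contigs.fa. This is a minimal sketch with a hypothetical helper name; it assumes the output is standard FASTA, where each record starts with a ">" line.

```python
def count_fasta_records(path):
    """Count sequences in a FASTA file by counting lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

With the command from this thread, count_fasta_records("Test_Haploflow/contigs.fa") should match the 3 contigs reported above for the toy data set.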

Viola-TA commented 1 year ago

Ah, ok. Then Haploflow seems to be running error-free for me now. Many thanks for the information and your help.