WHops / NAHRwhals

R package and wrapper functions for identifying serial structural variations from genome assemblies
MIT License

Unused argument #4

Open jiadong324 opened 3 months ago

jiadong324 commented 3 months ago

Dear author,

Thanks for developing this interesting tool! I've successfully tested it on the demo data, but I ran into issues running it on my own assemblies.

I first tried the following command, but it gives me the unused arguments error.

nahrwhals(ref_fa='./T2T-CHM13v2.fasta',alt_fa='./HG00733.vkk.hap1.fasta',outdir='res',threads=10)

Then I found that in your source code the parameter is asm_fa, not alt_fa as used in README.md, so I changed my call as follows:

nahrwhals(ref_fa='./T2T-CHM13v2.fasta',asm_fa='./HG00733.vkk.hap1.fasta',outdir='res',threads=10)

However, I still got the unused arguments error.

Looking forward to your reply! Thanks!
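
An "unused argument" error in R means the call passed a named argument that the installed function does not declare. One way to find the offending name is to compare the names in the call against the function's formals. The sketch below is illustrative only: `unused_args` is a hypothetical helper and `demo_fn` is a stand-in, not the real nahrwhals() signature.

```r
# Hypothetical helper: report which supplied argument names a function does not accept.
unused_args <- function(fn, supplied) {
  accepted <- names(formals(fn))
  # A function that declares `...` accepts any named argument.
  if ("..." %in% accepted) return(character(0))
  setdiff(supplied, accepted)
}

# Stand-in function for illustration only; NOT the real nahrwhals() signature.
demo_fn <- function(ref_fa, asm_fa, outdir, threads = 1) NULL

# Names used in the first call from this report.
rejected <- unused_args(demo_fn, c("ref_fa", "alt_fa", "outdir", "threads"))
print(rejected)
```

With the package loaded, the same helper could be pointed at the installed function, or `args(nahrwhals)` could be printed to see the argument names that version accepts.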

WHops commented 3 months ago

Hi Jiadong,

Thx for your patience! I refactored bits and pieces of the code and made a new release. Could I ask you to install the latest version 1.4 and try again? Things should hopefully work now.

best Wolfy

jiadong324 commented 3 months ago

Hi Wolfy,

I got another error:

[1] "No region or regionfile provided. Running whole genome discovery mode." 
[1] "No minimap2 index \".mmi\" file of one of the fastas found. Creating one now (takes around 1 minute for a whole human genome assembly.)"
[1] "No minimap2 index \".mmi\" file of one of the fastas found. Creating one now (takes around 1 minute for a whole human genome assembly.)"  
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
 line 1 did not have 2 elements
In addition: Warning messages:
1: In read.table(fasta_awk_tmp_file, stringsAsFactors = FALSE) :
 line 2 appears to contain embedded nulls
2: In read.table(fasta_awk_tmp_file, stringsAsFactors = FALSE) :
 line 3 appears to contain embedded nulls
3: In read.table(fasta_awk_tmp_file, stringsAsFactors = FALSE) :
 line 4 appears to contain embedded nulls 

Here are the outputs under outdir

drwxrwsr-x 2 jdlin eichlerlab 4096 Apr 16 08:48 minimap_idxs
drwxrwsr-x 3 jdlin eichlerlab 4096 Apr 16 08:49 whole-genome-windows

WHops commented 3 months ago

Hi Jiadong,

I haven't seen this error before, and have not managed to reproduce it. One thing I did notice is that the program is allergic to tabs or spaces in the contig names. Could that explain your error? Otherwise, which OS are you using?

What I would suggest to debug:

Thanks for sticking by! Anything that can help reproduce the error on my end would help a lot!

cheers Wolfy
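
The contig-name question above can be checked directly by listing the FASTA header lines that contain a space or tab. This is a minimal base-R sketch; `headers_with_whitespace` is a hypothetical helper and not part of NAHRwhals, and the demo file is made up for illustration.

```r
# Hypothetical helper: return FASTA header lines that contain a space or tab.
headers_with_whitespace <- function(fasta_path) {
  lines <- readLines(fasta_path, warn = FALSE)
  headers <- lines[startsWith(lines, ">")]
  headers[grepl("[ \t]", headers)]
}

# Small self-contained demonstration on a temporary file.
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">chr1", "ACGT", ">contig 2 extra", "GGCC"), tmp)
bad <- headers_with_whitespace(tmp)
print(bad)
```

Note that `readLines` loads the whole file into memory, so for a full human assembly it may be lighter to extract the header lines first with a command-line tool and inspect only those.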

jiadong324 commented 2 months ago

Hi,

I used both hg38 and CHM13 with targeted mode. It works and I feel that this is more like a targeted caller.

FYI

I do have a binary output file from get_fasta_info('./HG00733.vkk.hap1.fasta', 'out.txt')

It would be great if this could be applied genome-wide.
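
The "embedded nulls" warnings earlier in the thread and the binary output file mentioned here both suggest that NUL bytes may be present in a file that is expected to be plain text. One way to confirm this is to count NUL bytes at the start of the file. The sketch below assumes nothing about NAHRwhals internals; `count_nul_bytes` is a hypothetical helper and the demo file is synthetic.

```r
# Hypothetical helper: count NUL (0x00) bytes in the first `n_bytes` of a file.
count_nul_bytes <- function(path, n_bytes = 1e6) {
  bytes <- readBin(path, what = "raw", n = n_bytes)
  sum(bytes == as.raw(0))
}

# Synthetic demo file: the bytes 'A', NUL, 'C', NUL.
tmp <- tempfile()
writeBin(as.raw(c(0x41, 0x00, 0x43, 0x00)), tmp)
n_nul <- count_nul_bytes(tmp)
print(n_nul)
```

A nonzero count on a file such as out.txt would indicate that it is not plain text, which would be consistent with read.table failing to parse it.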

jiadong324 commented 2 months ago

Strangely, when I switch to another region, I get a new error:

[1] "Coordinates provided. Genotyping that region." 
[1] "Step 1: Identifying homologous regions in the y assembly" 
[1] "Step 2: Computing detailed pairwise alignment"
[1] "Step 3: Segmenting pairwise alignment"
[1] "Step 4: BFS search for mutation chains." 
ERROR: LoadError: ArgumentError: Package ArgParse not found in current path.
Run `import Pkg; Pkg.add("ArgParse")` to install the ArgParse package. 
Error in count.fields(juliares_path, sep = "\t") : 
cannot open the connection