hardingnj / xpclr

Code to compute the XP-CLR statistic to infer natural selection
MIT License
90 stars 27 forks

Just started with xpclr #99

Open jahneldardavik opened 1 year ago

jahneldardavik commented 1 year ago

Hello,

Being a total newbie to this software, I am struggling to find a tutorial that can get me started, so I'll take my chances here. I have two populations that I would like to compare. I have made a geno file for each of the two populations using this format:

1 0 1 1 9 9 1 1 1 0 0 0

I found this format in a possibly deprecated set of instructions. The numbers are separated by spaces. The map file has the columns 'snpname', 'chr#', 'cM', 'bpPos', 'Ref', 'Alt', separated by tabs ('\t').

Could anyone indulge me with a typical call to xpclr? I am sure I can refine the parameters later on, but for now I need to get it going. I tried this one:

xpclr genofile1.geno genofile5.geno annot.snp -O testCLR -w1 0.005 200 2000 -C 1 -p0

which gave multiple errors. Anyone? I'd be much obliged.

Thanks,
Jahn Davik

genofile1.txt annot.txt genofile5.txt

jahneldardavik commented 1 year ago

I've made a small step forward with my newbie issues. I now use the call:

xpclr -F "txt" --popA genofile1.geno --popB genofile5.geno --map annot.snp -O testCLR -C 1

But get these error messages thrown:

2023-10-16 08:33:39 : INFO : running xpclr v1.1.2
2023-10-16 08:33:39 : INFO : Loading TXT
2023-10-16 08:33:39 : WARNING : Possible SNPs file is not sorted. Attempting to sort. This is likely to be inefficient
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/xpclr-1.1.2-py3.8.egg/xpclr/util.py", line 73, in load_text_format_data
  File "/usr/local/lib/python3.8/dist-packages/allel/model/chunked.py", line 804, in __init__
    self.set_index(index)
  File "/usr/local/lib/python3.8/dist-packages/allel/model/chunked.py", line 812, in set_index
    self.index = SortedIndex(self[spec][:], copy=False)
  File "/usr/local/lib/python3.8/dist-packages/allel/model/ndarray.py", line 3384, in __init__
    raise ValueError('values must be monotonically increasing')
ValueError: values must be monotonically increasing

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/xpclr", line 4, in <module>
    __import__('pkg_resources').run_script('xpclr==1.1.2', 'xpclr')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 667, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1470, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.8/dist-packages/xpclr-1.1.2-py3.8.egg/EGG-INFO/scripts/xpclr", line 196, in <module>
  File "/usr/local/lib/python3.8/dist-packages/xpclr-1.1.2-py3.8.egg/EGG-INFO/scripts/xpclr", line 130, in main
  File "/usr/local/lib/python3.8/dist-packages/xpclr-1.1.2-py3.8.egg/xpclr/util.py", line 77, in load_text_format_data
  File "/usr/local/lib/python3.8/dist-packages/allel/model/chunked.py", line 804, in __init__
    self.set_index(index)
  File "/usr/local/lib/python3.8/dist-packages/allel/model/chunked.py", line 812, in set_index
    self.index = SortedIndex(self[spec][:], copy=False)
  File "/usr/local/lib/python3.8/dist-packages/allel/model/ndarray.py", line 3384, in __init__
    raise ValueError('values must be monotonically increasing')
ValueError: values must be monotonically increasing

I have sorted the map file by chromosome and position, but that is apparently wrong. Or is it? I'd appreciate any help.

Thanks, jahn

hardingnj commented 1 year ago

Hi @jahneldardavik

This is academic-level software that is not being maintained. I wrote it for a project a few years ago and decided to make it available to others. It is certainly not well tested or robust.

I'm happy to give pointers, but I can't guarantee performance or fix bugs. I'm also very happy to review PRs.

In this case the problem seems to be rooted in scikit-allel's VariantTable, so I would suggest trying to read your data in there to see what is causing the issue.
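Before loading anything into scikit-allel, it may help to find which rows of the map file break the "monotonically increasing" requirement. The sketch below is a plain-Python stand-in for that check, not a call into scikit-allel or xpclr. The function name `find_position_drops` is hypothetical, and it assumes the layout described earlier in the thread: tab-separated rows with no header line, chromosome in the second column and base-pair position in the fourth. Whether the position column is indexed per chromosome or across the whole file is not confirmed here, so the sketch can report both.

```python
def find_position_drops(map_lines, per_chromosome=True):
    """Return (line_number, chrom, previous_pos, pos) for each row whose
    position is lower than the position of the row it is compared with.

    per_chromosome=True  compares with the previous row on the same chromosome.
    per_chromosome=False compares with the previous row in the file, which is
    what a single sorted index over the whole position column would see.
    """
    last_pos = {}
    drops = []
    for line_number, line in enumerate(map_lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.rstrip("\n").split("\t")
        # Assumed column order: snpname, chr, cM, bpPos, Ref, Alt
        chrom, pos = fields[1], int(fields[3])
        key = chrom if per_chromosome else None
        prev = last_pos.get(key)
        if prev is not None and pos < prev:
            drops.append((line_number, chrom, prev, pos))
        last_pos[key] = pos
    return drops


# Made-up rows for illustration only.
example = [
    "snp1\t1\t0.0\t100\tA\tG",
    "snp2\t1\t0.0\t250\tC\tT",
    "snp3\t2\t0.0\t50\tG\tA",
    "snp4\t2\t0.0\t40\tT\tC",
]
print(find_position_drops(example))
print(find_position_drops(example, per_chromosome=False))
```

If the per-chromosome check comes back empty but the whole-file check does not, the positions restart at each new chromosome, which would be consistent with a file that is sorted by chromosome and position yet still fails a single sorted index.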

A quick fix might be to split the input data by chromosome; having several chromosomes in one file might be what is causing the issue.
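Splitting by chromosome could look roughly like the sketch below. It is only an illustration under assumptions that the thread does not confirm: each row of a geno file corresponds to the row with the same line number in the map file, the map file is tab-separated with no header, and the chromosome and position sit in the second and fourth columns. The function names `split_by_chromosome` and `write_split_files` and the output file naming are hypothetical.

```python
from collections import defaultdict


def split_by_chromosome(map_lines, geno_a_lines, geno_b_lines):
    """Group row-aligned map and genotype lines by chromosome.

    Returns {chrom: {"map": [...], "popA": [...], "popB": [...]}} with the
    rows of each chromosome sorted by position.
    """
    if not (len(map_lines) == len(geno_a_lines) == len(geno_b_lines)):
        raise ValueError("map and genotype inputs must have the same number of rows")

    rows_by_chrom = defaultdict(list)
    for map_row, a_row, b_row in zip(map_lines, geno_a_lines, geno_b_lines):
        fields = map_row.rstrip("\n").split("\t")
        # Assumed column order: snpname, chr, cM, bpPos, Ref, Alt
        chrom, pos = fields[1], int(fields[3])
        rows_by_chrom[chrom].append((pos, map_row, a_row, b_row))

    groups = {}
    for chrom, rows in rows_by_chrom.items():
        # Sort by position only; the sort is stable, so ties keep file order.
        rows.sort(key=lambda row: row[0])
        groups[chrom] = {
            "map": [row[1] for row in rows],
            "popA": [row[2] for row in rows],
            "popB": [row[3] for row in rows],
        }
    return groups


def write_split_files(groups, prefix):
    """Write one map file and two geno files per chromosome; return the paths."""
    paths = []
    for chrom, tables in groups.items():
        for kind, lines in tables.items():
            path = f"{prefix}.chr{chrom}.{kind}.txt"
            with open(path, "w") as handle:
                for line in lines:
                    handle.write(line.rstrip("\n") + "\n")
            paths.append(path)
    return paths
```

Each per-chromosome set of files could then be passed to the same call shown earlier in the thread (`--popA`, `--popB`, `--map`, with `-C` set to that chromosome). Whether that avoids the error has not been verified here.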