szpiech / selscan

Haplotype based scans for selection
GNU General Public License v3.0

output file empty and command line flag --pmap not recognized. #123

Open SquRunner opened 3 weeks ago

SquRunner commented 3 weeks ago

I am hoping to run iHS and xpEHH on a phased VCF from a non-model organism, but my output files come out empty. I thought this might be due to my .map files, so I tried the --pmap flag instead, but I get the error:
ERROR: command line flag --pmap not recognized.

My command was: selscan --ihs --vcf asf.vcf.gz --pmap --keep-low-freq --out asf

or selscan --ihs --vcf asf.vcf.gz --map asf.map --keep-low-freq --out asf

My .map file looks like:

chr34 chr34-1102 1102 1102
chr34 chr34-1212 1212 1212
chr34 chr34-1236 1236 1236
chr34 chr34-1261 1261 1261

An example of the log output:

WARNING: Reached chromosome edge before EHH decayed below 0.05. Skipping calculation at chr34-1102
WARNING: Locus chr34-1212 has MAF < 0.05. Skipping calculation at chr34-1212
WARNING: Locus chr34-1236 has MAF < 0.05. Skipping calculation at chr34-1236

Any insight would be amazing. I am using an older version (selscan v1.2.0) as my computing cluster relies on conda/mamba installation and this is the most recent version currently available. So, I guess I wonder if you have any suggestions for updating the version given this situation, as I guess this might also help solve my command line issues?

Thank you so much for the help!

szpiech commented 3 weeks ago

Hi,

Well, if you can't have a more recent version installed on your cluster, then you can mimic the behavior of --pmap by creating a map file and putting the physical distance in the genetic map column. I'd recommend dividing the physical position by 1,000,000 when doing this, as it will improve runtime.
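
For readers in the same situation, the workaround above could be scripted roughly as follows. This is a minimal sketch, not part of selscan: the names `map_lines` and `write_pseudo_map` are hypothetical, and it assumes the four-column map layout used in this thread (chromosome, locus ID, genetic position, physical position). Where the VCF ID column is `.`, it makes up an ID of the form `chrom-pos` to match the example map file.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def map_lines(vcf_lines, scale=1_000_000):
    """Yield one map line per VCF record.

    Assumed column order: chromosome, locus ID, genetic position,
    physical position. The "genetic" column is simply the physical
    position divided by `scale`.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, snp_id = line.split("\t", 3)[:3]
        if snp_id == ".":
            snp_id = f"{chrom}-{pos}"  # hypothetical ID scheme
        yield f"{chrom} {snp_id} {int(pos) / scale:.6f} {pos}"


def write_pseudo_map(vcf_path, map_path, scale=1_000_000):
    """Write a map file for every record in the VCF at `vcf_path`."""
    with open_text(vcf_path) as vcf, open(map_path, "w") as out:
        for row in map_lines(vcf, scale):
            out.write(row + "\n")
```

Calling `write_pseudo_map("asf.vcf.gz", "asf.map")` would then produce a file that can be passed with `--map asf.map`, as in the second command in the original post.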

Zachary

SquRunner commented 3 weeks ago

Thank you for the comment! This is actually what I had been trying. The .map file (see above) has the four columns "CHROM", "SNP ID", "PHYSICAL POSITION", "PHYSICAL POSITION". I had not divided by 1,000,000, but I am not sure this should affect the problem I am seeing (i.e. nothing written to the output file and only warnings in the log file)? I am very likely missing something, though. Thanks!

szpiech commented 3 weeks ago

So you should pass the file with --map, not --pmap, as the latter is a true/false flag. Dividing by 1,000,000 is not required, but it will make the calculation run faster.

Zachary

SquRunner commented 3 weeks ago

Thanks! Sorry, I fear my description was a bit confusing. I actually do this in the second command above:

selscan --ihs --vcf asf.vcf.gz --map asf.map --keep-low-freq --out asf

This is the command that produces the empty output file and the warnings in the log file, using the .map file shown above.

The command selscan --ihs --vcf asf.vcf.gz --pmap --keep-low-freq --out asf only results in the error that --pmap is not recognized, which is when I went on to try the command without --pmap, using the .map file described above.

Maybe I should also mention that this happens for all chromosomes as well, and does not seem to be limited to just one short chromosome, for example.

Thanks!

SquRunner commented 3 weeks ago

Oh I am so sorry, I just went back and re-filtered my vcf for biallelic phased genetic data, and now the files seem to be running fine! There must have been some strange error in the original filtering step that resulted in this odd error... Thank you so much for your quick responses and I apologize for the confusion!
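
The thread does not say exactly which filter made the difference, so the following is only a sketch of the kind of sanity check that could be run on a VCF before handing it to selscan. It flags records that are multiallelic, contain missing genotypes, or contain unphased genotypes. The function name `problem_sites` is hypothetical, and the code assumes a tab-delimited VCF in which GT is the first FORMAT field.

```python
def problem_sites(vcf_lines):
    """Yield (chrom, pos, reason) for records that are not biallelic,
    complete, and fully phased.

    Assumes tab-delimited VCF records with GT as the first FORMAT field.
    Only the first problem found at each site is reported.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt = fields[0], fields[1], fields[4]
        if len(fields) < 10 or not fields[8].startswith("GT"):
            yield chrom, pos, "no GT field"
            continue
        # The genotype is the first colon-separated entry of each sample.
        genotypes = [sample.split(":")[0] for sample in fields[9:]]
        if "," in alt:
            yield chrom, pos, "multiallelic"
        elif any("." in gt for gt in genotypes):
            yield chrom, pos, "missing genotype"
        elif any("/" in gt for gt in genotypes):
            yield chrom, pos, "unphased genotype"


# Example usage on a gzip-compressed file (file name taken from the thread):
#
#     import gzip
#     with gzip.open("asf.vcf.gz", "rt") as vcf:
#         for chrom, pos, reason in problem_sites(vcf):
#             print(chrom, pos, reason)
```

If this prints nothing, every record has exactly one ALT allele and only phased, non-missing genotypes.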

szpiech commented 3 weeks ago

Great, glad to hear it!
