liusihan / seGMM

A new tool to infer sex from massively parallel sequencing data.
MIT License

When running: plink --vcf my.vcf --make-bed --out sex/plink An error was occured, please check the parameters! #7

Open UndressK opened 1 year ago

UndressK commented 1 year ago

Hello,

I am getting an error when trying to run the software. Here is my code:

seGMM --vcf my.vcf \
              --input bam.file --alignment_format BAM \
              --reference_additional ~/resources/seGMM/ref_1000G_WES.txt \
              -t WES -o sex -g hg38

bam.file: S1001 ~/data/S1001/S1001.bam

Any idea why it is failing? Thank you!

liusihan commented 1 year ago

Hello,

Does your VCF file contain any contigs other than chr1 to chr22, chrX, and chrY? The issue with seGMM seems to arise when the input VCF contains ALT contigs such as chrX_KI270880v1_alt. Please refer to the similar question in #6.

Thank you!
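
One way to check this is to list the contig names that actually appear in the variant records, as opposed to the header. The following is a minimal sketch using standard Unix tools; the function name list_vcf_contigs is hypothetical and is not part of seGMM.

```shell
# List the distinct contig names used in the CHROM column of a VCF,
# skipping all header lines (those starting with '#').
# Handles plain-text and gzip/bgzip-compressed files.
list_vcf_contigs() {
    vcf="$1"
    case "$vcf" in
        *.gz) gzip -dc "$vcf" ;;
        *)    cat "$vcf" ;;
    esac | grep -v '^#' | cut -f1 | sort -u
}

# Example usage (my.vcf is the file name from the original post):
# list_vcf_contigs my.vcf
```

Any name in the output other than the 22 autosomes, X, and Y (with or without the "chr" prefix) is a candidate for the problem described above.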

UndressK commented 1 year ago

Thanks for your reply! The chromosomes in my VCF file are named without the "chr" prefix: "1", "2", ..., "X", "Y", "MT". There are some alternative contigs, but only in the header ("##contig=" lines); no variants appear to be called on them in the rest of the VCF file. Will seGMM also fail if the chromosomes are not named "chr1", "chr2", and so on?

Andrés C.


UndressK commented 1 year ago

I tried renaming the chromosomes by adding the "chr" prefix to the CHROM column, but I am still getting the same error.

liusihan commented 1 year ago

Hi Andrés,

seGMM can predict sex from a VCF file whose chromosomes are named with or without the "chr" prefix. If possible, could you upload the VCF file you used, or try running plink as follows: plink --vcf vcffile --make-bed --out output --allow-extra-chr
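
If the extra contigs are not needed, another option is to drop their records before running the tool. The sketch below is only an illustration and is not part of seGMM; keep_primary_contigs is a hypothetical helper name. It assumes an uncompressed, tab-separated VCF and leaves the header untouched, so "##contig=" lines for removed contigs remain.

```shell
# Keep header lines plus variant records on chromosomes 1-22, X and Y,
# named with or without the "chr" prefix. Records on any other contig
# (ALT contigs, MT, unplaced scaffolds) are dropped.
keep_primary_contigs() {
    awk -F'\t' '
        /^#/ { print; next }                                  # pass header through
        $1 ~ /^(chr)?([1-9]|1[0-9]|2[0-2]|X|Y)$/ { print }    # primary contigs only
    ' "$1"
}

# Example usage (file names are placeholders):
# keep_primary_contigs my.vcf > my.primary.vcf
```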

Sihan

UndressK commented 1 year ago

Hi Sihan,

Thanks for your reply! Sorry, where exactly should I add that line of code?


liusihan commented 1 year ago

Hi Andrés,

You need to activate the seGMM environment with conda and run the plink code in a shell window.

UndressK commented 1 year ago

plink --vcf vcffile --make-bed --out output --allow-extra-chr

plink: unknown option "--vcf"

plink: unknown option "--make-bed"

plink: unknown option "--out"

Andrés C.
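
"Unknown option" messages for --vcf, --make-bed and --out suggest that the program being run is not PLINK 1.9. One possibility (an assumption, not confirmed in this thread) is that the seGMM conda environment is not active and a different program that is also called plink, such as PuTTY's command-line tool, comes first on PATH. A minimal sketch for checking which executables the shell can see; find_on_path is a hypothetical helper name:

```shell
# Print every executable file called "$1" found in the directories
# listed in PATH, in the order the shell would search them.
find_on_path() {
    name="$1"
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.          # an empty PATH entry means the current directory
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

# Example usage:
# find_on_path plink
```

The first path printed is the one that runs when plink is typed. After activating the seGMM environment, it would be expected to point inside that environment's bin directory.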


liusihan commented 1 year ago

Please test the code as follows:

(base) [shliu@admin1 ~]$ conda activate seGMM
(seGMM) [shliu@admin1 ~]$ ls
test.vcf
(seGMM) [shliu@admin1 ~]$ plink --vcf test.vcf --make-bed --out test0608 --allow-extra-chr
PLINK v1.90b6.21 64-bit (19 Oct 2020)          www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to test0608.log.
Options in effect:
  --allow-extra-chr
  --make-bed
  --out test0608
  --vcf test.vcf

257437 MB RAM detected; reserving 128718 MB for main workspace.
--vcf: test0608-temporary.bed + test0608-temporary.bim + test0608-temporary.fam
written.
145345 variants loaded from .bim file.
10 people (0 males, 0 females, 10 ambiguous) loaded from .fam.
Ambiguous sex IDs written to test0608.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 10 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.0529802.
145345 variants and 10 people pass filters and QC.
Note: No phenotypes present.
--make-bed to test0608.bed + test0608.bim + test0608.fam ... done.

vikramadhithyaThotam22 commented 2 months ago

After that, how do I determine sex using the seGMM tool?