weizhouUMICH / SAIGE

GNU Lesser General Public License v3.0
187 stars 72 forks source link

step2_SPAtests.R ignores --vcfFileIndex in v0.44.6.5 #370

Closed nkgx closed 2 years ago

nkgx commented 2 years ago

I am running SAIGE with a VCF + TBI index (not CSI) by providing a --vcfFileIndex=abc.vcf.gz.tbi parameter. This used to work in older versions but in v0.44.6.5 I am getting this error:

Error in SPAGMMATtest(vcfFile = opt$vcfFile, vcfFileIndex = opt$vcfFileIndex,  : 
  ERROR! vcfFileIndex abc.vcf.gz.csi does not exist

I believe this is caused by these lines, which will always look for a vcf.gz.csi file even if --vcfFileIndex is provided: https://github.com/weizhouUMICH/SAIGE/blob/57c43dbc6622a154f434fcf696e767b1820f1e7a/R/SAIGE_SPATest.R#L294

Would it be possible to fix this?

mtdrlv commented 2 years ago

I just stumbled upon the same error with v.44.6.5. Which version is safe to use with vcf +tbi files until this problem is fixed?

ytao0210 commented 2 years ago

You can use "tabix -C *.vcf.gz" to generate a csi index.

mtdrlv commented 2 years ago

I generated csi index files and these worked OK

On Wed, Jan 19, 2022 at 10:28 AM ytao0210 @.***> wrote:

I met the same problem. How did you solve this?

— Reply to this email directly, view it on GitHub https://github.com/weizhouUMICH/SAIGE/issues/370#issuecomment-1016196205, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALEVTLGZ5RH6UO4QNIWFPFDUWZYZVANCNFSM5EQ6DBQA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you commented.Message ID: @.***>

pettyalex commented 2 years ago

I quick glance at the code made it look like SAIGE step2_SPAtests.R now requires both .tbi and .csi indexes, because it's using the savvy library which requires .csi and guesses the file-name based on the vcf.gz file name, but I saw R code in SAIGE that still uses the vcfFileIndex.

@weizhouUMICH is the plan for SAIGE to move to only using .csi indexes?

Edit: I took a bit of a look around. The value passed to the vcfIndex is no longer used, and only CSI indexes are used. The change was made here, and mentioned in the PR discussion: https://github.com/weizhouUMICH/SAIGE/pull/305. However, SAIGE documentation hasn't been updated, and we're still passing around an unnecessary vcfiIndex variable.

weizhouUMICH commented 2 years ago

Sorry for the late reply! We have just released a new version 1.0.0. It now only uses csi for VCF index. It has computational efficiency improvements for both Step 1 and Step 2 for single-variant and set-based tests. We have created a new program github page https://github.com/saigegit/SAIGE with the documentation provided https://saigegit.github.io/SAIGE-doc/ The program will be maintained by multiple SAIGE developers there. The docker image has been updated. Please feel free to try the version 1.0.0 and report issues if any.

Thanks! Wei