wbs-whuer closed this issue 4 months ago
Hi @wbs-whuer,
Thanks for your question. Here is a quick point for you to check:
In your aGDS file, `variant.id`, `position`, `chromosome`, `allele`, `genotype`, and the QC parameters (e.g. `annotation/filter`) show 2,949,744 variants in the chromosome 21 file. However, all of your functional annotation fields (i.e. those under `annotation/info/FunctionalAnnotation`) show 2,946,875 variants in the same file. As a sanity check before running STAARpipeline, these two numbers, and the order of the variants, should be exactly the same. The discrepancy (2,949,744 vs 2,946,875) could be the cause of the issue you reported, so I would recommend double-checking the FAVORannotator step.
Hope this helps.
Best, Xihao
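The sanity check described above can be sketched in R roughly as follows. This is a minimal sketch, not part of STAARpipeline: `check_variant_counts` and `read_gds_lengths` are hypothetical helper names, and the wrapper assumes that the `SeqArray` and `gdsfmt` packages are installed and that the annotation fields sit directly under `annotation/info/FunctionalAnnotation`. It reads node dimensions rather than the data itself, so it does not depend on the annotation values being readable.

```r
# Pure helper: given the number of variants and a named vector of
# annotation-field lengths, report which fields disagree.
check_variant_counts <- function(n_variants, annotation_lengths) {
  bad <- which(annotation_lengths != n_variants)
  list(consistent = length(bad) == 0L,
       mismatched = annotation_lengths[bad])
}

# Hypothetical wrapper (assumption: SeqArray and gdsfmt are installed).
# It reads only the dimensions of each node in the aGDS file.
read_gds_lengths <- function(gds_path,
                             annotation_dir = "annotation/info/FunctionalAnnotation") {
  genofile <- SeqArray::seqOpen(gds_path)
  on.exit(SeqArray::seqClose(genofile))
  # Length of the last dimension of a node, or NA when the node has none
  # (for example, a folder).
  node_length <- function(node) {
    d <- gdsfmt::objdesp.gdsn(node)$dim
    if (length(d) == 0L) NA_real_ else as.numeric(d[length(d)])
  }
  folder <- gdsfmt::index.gdsn(genofile, annotation_dir)
  fields <- gdsfmt::ls.gdsn(folder)
  lens <- vapply(fields,
                 function(f) node_length(gdsfmt::index.gdsn(folder, f)),
                 numeric(1))
  list(n_variants = node_length(gdsfmt::index.gdsn(genofile, "variant.id")),
       annotation_lengths = lens[!is.na(lens)])
}

# With the counts reported in this thread, the check fails:
check_variant_counts(2949744,
                     c(genecode_comprehensive_category = 2946875))$consistent
# FALSE
```

On a real file, the two pieces would be combined as `x <- read_gds_lengths("path/to/file.gds")` followed by `check_variant_counts(x$n_variants, x$annotation_lengths)`, where the path is a placeholder. Note that equal counts do not by themselves prove that the variant order matches.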
Thanks for your reply! The error disappears after I restrict the variants in genofile to those annotated by FAVORannotator. However, in your source code (e.g. `noncoding.R`), the filter on the GDS file (`seqSetFilter(genofile, variant.id=variant.id.keep)`) is reset many times (`seqResetFilter(genofile)`), so a filter I set beforehand does not persist. I could extract the variants annotated by FAVORannotator from my original VCF file, convert that to a GDS file, and annotate it again. But is there a more efficient way to avoid this error, such as modifying the GDS file and saving it as a new GDS file? Thank you!
Hi @wbs-whuer,
Thank you for your reply. Glad to hear that this is related to the cause of the issue. My suggestion is to double-check the FAVORannotator step: after running FAVORannotator, the number of variants in `variant.id` (and `genotype`, etc.) is expected to be the same as in the functional annotations (e.g. `genecode_comprehensive_category`), even if some variants in your original VCF file are not annotated in the FAVOR database. Once that holds, all downstream STAARpipeline steps should run without any issue.
If possible, could you please provide some examples of variants that are in your original VCF file but were not annotated by FAVORannotator (i.e. the variants that account for the difference between 2,949,744 and 2,946,875)?
Best, Xihao
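One way to list the variants that account for the difference is to build a key per variant and take the set difference. The sketch below is illustrative only: `variant_key` and `unannotated_variants` are hypothetical helpers, the `chr-pos-ref-alt` key format is an assumption chosen for the example, and the variants shown are toy data, not taken from the reporter's file.

```r
# Build a "chr-pos-ref-alt" key per variant. format() avoids scientific
# notation for positions that are stored as doubles.
variant_key <- function(chr, pos, ref, alt) {
  paste(chr, format(pos, scientific = FALSE, trim = TRUE), ref, alt, sep = "-")
}

# Keys present in the genotype data but absent from the annotated set.
unannotated_variants <- function(all_keys, annotated_keys) {
  all_keys[!all_keys %in% annotated_keys]
}

# Toy data (made up for illustration).
all_keys <- variant_key(chr = c("21", "21", "21"),
                        pos = c(100000, 200000, 300000),
                        ref = c("A", "C", "G"),
                        alt = c("G", "T", "A"))
annotated_keys <- all_keys[c(1, 3)]

unannotated_variants(all_keys, annotated_keys)
# "21-200000-C-T"
```

In practice, `all_keys` would be built from the chromosome, position, and allele fields of the GDS file, and `annotated_keys` from whatever list of successfully annotated variants is available from the annotation step.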
Thanks for your reply! I have uploaded a file containing the variants that failed in the annotation step. It seems that those alleles are on the opposite strand. chr22_notmatch.txt
Hi @wbs-whuer,
Thanks for the update. These allele-matching issues should already be resolved by the time the GDS files are created (i.e. before the FAVORannotator step), so you should consider flipping the alleles as necessary in the original file format (e.g. VCF). Also, please double-check your original files for other issues (e.g. variants that do not belong to a chromosome but are stored in that chromosome's VCF file).
Best, Xihao
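To see whether the unmatched variants are in fact strand flips, each allele pair can be compared with the reference database's pair after reverse-complementing. The following is a rough diagnostic sketch in base R with hypothetical function names; it only classifies mismatches and does not replace fixing the alleles in the original VCF.

```r
# Complement each base; characters other than A, C, G, T are left unchanged.
complement_allele <- function(x) chartr("ACGT", "TGCA", x)

# Reverse complement, so multi-base alleles (indels) are handled too.
reverse_complement <- function(x) {
  vapply(strsplit(complement_allele(x), ""),
         function(bases) paste(rev(bases), collapse = ""),
         character(1))
}

# Classify each variant against the database alleles (db_ref, db_alt).
# Caveat: palindromic SNPs (A/T, C/G) cannot be resolved this way, because
# a strand flip and a ref/alt swap look identical for them.
classify_mismatch <- function(ref, alt, db_ref, db_alt) {
  rc_ref <- reverse_complement(ref)
  rc_alt <- reverse_complement(alt)
  ifelse(ref == db_ref & alt == db_alt, "match",
  ifelse(rc_ref == db_ref & rc_alt == db_alt, "strand_flip",
  ifelse(ref == db_alt & alt == db_ref, "ref_alt_swap", "other")))
}

# Toy example: an A/G variant compared with four possible database entries.
classify_mismatch(ref = rep("A", 4), alt = rep("G", 4),
                  db_ref = c("A", "T", "G", "C"),
                  db_alt = c("G", "C", "A", "T"))
# "match" "strand_flip" "ref_alt_swap" "other"
```

Variants labelled `strand_flip` are the candidates for flipping upstream; those labelled `other` would need a closer look.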
Hi Xihao,
Thanks for designing such a useful tool. I'm trying to run _STAARpipeline_Gene_CentricNoncoding.r from _Step 3.2: Gene-centric noncoding analysis_ in my own way, and the error below showed up:
I then ran the code in `noncoding.R` line by line, and the error seems to be raised when I run
In this code, `paste0(Annotation_dir,Annotation_name_catalog$dir[which(Annotation_name_catalog$name=="GENCODE.Category")])` refers to "annotation/info/FunctionalAnnotation/genecode_comprehensive_category". I double-checked the contents of my genofile, and "annotation/info/FunctionalAnnotation/genecode_comprehensive_category" is indeed present. I can run _STAARpipeline_Gene_CentricCoding.r without any error, and I found similar code in `coding.R` (line 64) that filters variants and samples before calling `seqGetData()`. The code
GENCODE.Category <- seqGetData(genofile, paste0(Annotation_dir,Annotation_name_catalog$dir[which(Annotation_name_catalog$name=="GENCODE.Category")]))
runs successfully once I restrict the variants. There were 2,949,744 variants in my genofile before filtering. Could the error be caused by having too many variants in my genofile? Should I filter out some variants (e.g. by MAF or other quality-control indicators) before running this code?
Thank you very much for your time and help,
Bangsheng