zhilizheng / SBayesRC

GNU General Public License v3.0

impute file #19

Closed: zyx125186 closed this issue 4 months ago

zyx125186 commented 5 months ago

I have a few questions I would like to ask, please. Error in SBayesRC::sbayesrc(mafile = "res_imp.ma",

I have checked the imputed .ma file and found no missing values, and it has the same number of loci as the GWAS summary statistics file. Is there anything else I might have overlooked?

zhilizheng commented 5 months ago

Hi @zyx125186,

Thanks for the report. Could you provide more context? Without it, I can't work out what went wrong. We and our partners have run hundreds of traits without problems.

Regards, Zhili

zyx125186 commented 5 months ago

Both questions I raised concern calculating the locus effects. I am using a beef cattle dataset and have tried five traits (calving ease, body weight, dressing percentage, average daily gain, and weaning weight). The calculation works for every trait except calving ease. I processed the calving ease data with the same method, but ran into the two issues mentioned above.

zyx125186 commented 5 months ago

The dataset was obtained using the same 770k SNP chip, and has undergone some quality control.

zyx125186 commented 5 months ago

Could it be that the heritability of calving ease is too low to allow for calculation?

zhilizheng commented 5 months ago

Hi @zyx125186,

Thanks for the information. Some of the inputs may not be right. I think those traits should be heritable. I don't have a background in animal genomics, so these suggestions may be incorrect:

  1. Check the LD. You need to calculate the LD from your animal data. I believe animals can have a more complex LD structure than humans, so you may need to define your own LD block information. If the dataset is small, you can also define each whole chromosome as one block to avoid the LD block definition issue.
  2. Check the QC. Rare variants (MAF < 0.01) may cause model issues.
  3. If the sample size is small, use the BayesRC method; individual-level data may work better here.
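The rare-variant check in point 2 can be sketched roughly as follows. This is a minimal illustration, not part of SBayesRC: it assumes the .ma file uses the COJO-style header `SNP A1 A2 freq b se p N`, and the function name `find_rare_variants` is hypothetical.

```python
def find_rare_variants(ma_lines, maf_threshold=0.01):
    """Return the SNP IDs whose minor allele frequency is below maf_threshold.

    ma_lines is any iterable of text lines (an open file or a list).
    Assumption: the first line is a header containing 'SNP' and 'freq' columns.
    """
    lines = iter(ma_lines)
    header = next(lines).split()
    snp_col = header.index("SNP")
    freq_col = header.index("freq")

    rare = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        freq = float(fields[freq_col])
        # 'freq' is the frequency of A1, which is not necessarily the minor allele.
        maf = min(freq, 1.0 - freq)
        if maf < maf_threshold:
            rare.append(fields[snp_col])
    return rare


# Made-up rows for illustration only.
example_rows = [
    "SNP A1 A2 freq b se p N",
    "rs1 A G 0.250 0.010 0.005 0.045 5000",
    "rs2 C T 0.004 0.200 0.150 0.180 5000",
    "rs3 G A 0.995 -0.100 0.120 0.400 5000",
]
print(find_rare_variants(example_rows))
```

For a real file, something like `with open("res_imp.ma") as handle: find_rare_variants(handle)` should work, provided the header matches the assumption above.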

Hope it's helpful to you.

Regards, Zhili

zyx125186 commented 5 months ago

Thank you for your suggestion. I also have some questions about building the LDM. I used the Generate LD code you provided and followed the refblock format you provided, but ran into an issue at step 1 of building the LDM. (image: refblock) Above is the refblock file I defined myself. After running step 1, the resulting ldm.info looks like this. (image: step1) I don't quite understand why most of the loci are assigned to NA.
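One way to narrow down a problem like this is to check, independently of the package, which markers fall inside none of the user-defined blocks. The sketch below is only a diagnostic under stated assumptions and does not reproduce how SBayesRC assigns blocks: the block table is represented as hypothetical `(block_id, chrom, start_bp, end_bp)` tuples with non-overlapping ranges, and marker positions come from PLINK .bim lines (chromosome, variant ID, cM position, bp position, two alleles).

```python
from bisect import bisect_right


def parse_bim_line(line):
    """Extract (snp_id, chrom, bp) from one PLINK .bim line."""
    fields = line.split()
    return fields[1], fields[0], int(fields[3])


def assign_blocks(snps, blocks):
    """Map each SNP ID to the block containing it, or None if no block does.

    snps:   iterable of (snp_id, chrom, bp)
    blocks: iterable of (block_id, chrom, start_bp, end_bp), assumed non-overlapping
    """
    index = {}
    for block_id, chrom, start_bp, end_bp in blocks:
        index.setdefault(str(chrom), []).append((start_bp, end_bp, block_id))
    starts = {}
    for chrom, intervals in index.items():
        intervals.sort()
        starts[chrom] = [start for start, _, _ in intervals]

    assignment = {}
    for snp_id, chrom, bp in snps:
        intervals = index.get(str(chrom), [])
        # Rightmost block whose start is <= bp, if there is one.
        pos = bisect_right(starts.get(str(chrom), []), bp) - 1
        hit = pos >= 0 and intervals[pos][0] <= bp <= intervals[pos][1]
        assignment[snp_id] = intervals[pos][2] if hit else None
    return assignment


# Made-up markers and blocks for illustration only.
bim_lines = ["1 snpA 0 1500 A G", "1 snpB 0 9000 A G", "chr2 snpC 0 300 C T"]
blocks = [(1, "1", 1, 5000), (2, "2", 1, 1000)]
result = assign_blocks([parse_bim_line(line) for line in bim_lines], blocks)
print([snp for snp, block in result.items() if block is None])
```

In this toy input, `snpB` lies beyond the end of its block and `snpC` uses the chromosome label `chr2` while the block table uses `2`. Both are examples of mismatches that would leave a marker unassigned in this check; whether either applies to the real data is unknown.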

zyx125186 commented 5 months ago

I think the following is perhaps what the correct result should look like. I made it myself, following the ldm.info format above. (image: true)

zyx125186 commented 5 months ago

Thank you very much for your suggestions. I have identified the cause of the two issues I raised earlier: they were mainly due to NA values in the .ma file. As for constructing the LD matrix, I used other methods to modify the contents of the ldm.info file and the other generated files, so I did not follow the steps in your code for step 1 and step 2. Even so, step 3 and step 4 run normally. Thanks again and all the best!
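Since NA values in the .ma file turned out to be the cause, a stricter scan than a visual check may help others who hit the same error. The sketch below is an illustration only: it assumes a COJO-style header `SNP A1 A2 freq b se p N`, and the names `find_bad_rows` and `is_finite_number` are hypothetical. It flags any numeric field that does not parse as a finite number, which catches `NA`, `nan`, and `inf`, as well as rows with the wrong number of fields.

```python
import math

# Assumption: these columns of the .ma file should hold finite numbers.
NUMERIC_COLUMNS = ("freq", "b", "se", "p", "N")


def is_finite_number(text):
    """True if text parses as a float that is neither NaN nor infinite."""
    try:
        return math.isfinite(float(text))
    except ValueError:
        return False


def find_bad_rows(ma_lines):
    """Return (line_number, snp_id, column, value) for every problematic field.

    Rows with the wrong number of fields are reported with column set to None.
    """
    lines = iter(ma_lines)
    header = next(lines).split()
    problems = []
    for line_number, line in enumerate(lines, start=2):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(header):
            message = f"{len(fields)} fields, expected {len(header)}"
            problems.append((line_number, fields[0], None, message))
            continue
        row = dict(zip(header, fields))
        for column in NUMERIC_COLUMNS:
            if not is_finite_number(row[column]):
                problems.append((line_number, row["SNP"], column, row[column]))
    return problems


# Made-up rows for illustration only.
example_rows = [
    "SNP A1 A2 freq b se p N",
    "rs1 A G 0.25 0.01 0.005 0.045 5000",
    "rs2 C T 0.10 NA 0.02 0.5 5000",
    "rs3 G A 0.30 0.02 nan 0.3 5000",
    "rs4 T C 0.40 0.03 0.01",
]
for problem in find_bad_rows(example_rows):
    print(problem)
```

Parsing each value is used here instead of searching for empty cells because a missing value can appear as text such as `NA` or `nan`, which a search for blanks may not catch.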

zhilizheng commented 5 months ago

Hi @zyx125186,

Good to know you have figured this out. Sorry for the late response; I was overwhelmed last week. Thanks for the report. I will look into this on our side and make it more robust. Could you send me an example if you have time?

Regards, Zhili

zyx125186 commented 4 months ago

Hi Zhili, I'm sorry for the late response. I have been using our research group's beef cattle dataset, and I ran into an issue when constructing the LD matrix (LDM): many loci were assigned to "NA", and I don't quite understand why. I believe this is incorrect. Later on, I referred to SDPR (A fast and robust Bayesian nonparametric method for prediction of complex traits using summary statistics) to construct the LD matrix. In other words, I used SDPR's methods as a substitute for step 1 and step 2, and with that I successfully estimated the effect of each locus. Regards, zyx

zhilizheng commented 4 months ago

Hi @zyx125186,

Thanks for your report. I was unable to reproduce the issue, so I would need the data to do so.

Could you share your genotype BIM file (*.bim, which only includes the marker information) and the block definition file? My email address: zhili[dot]zheng[at]broadinstitute[dot]org. I will delete the data after debugging.

Thanks very much.

Regards, Zhili