choishingwan / PRSice

A software package for calculating, applying, evaluating and plotting the results of polygenic risk scores
http://prsice.info
GNU General Public License v3.0

missing all score #142

Closed jennysjaarda closed 5 years ago

jennysjaarda commented 5 years ago

I am trying to use PRSice to get polygenic risk scores at multiple thresholds (very similar to the question here); however, my ".all.score" file is not as expected. The first two columns are FID and IID, the third contains a list of PRS (I'm assuming), and the remaining columns are empty. There are 2003 column headers in total, and I am trying to load the file into R with fread.

I know this issue was apparently resolved in versions after 2.1, but I am running version 2.2.1. The log file is here; if you need the all.score file, let me know. 21001_irnt_male.log
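For anyone checking a similar file before loading it with fread, one quick diagnostic is to compare the number of fields in each data row with the number in the header. The following is a minimal sketch and not part of PRSice: `check_columns` is a hypothetical helper name, and it assumes the file is whitespace-delimited with a single header line.

```shell
# Hypothetical helper (not part of PRSice): compare the field count of each
# data row with that of the header. Assumes a whitespace-delimited file with
# one header line.
check_columns() {
    # $1: path to the file to inspect
    awk '
        NR == 1 { expected = NF; next }   # remember the header width
        NF != expected { bad++ }          # count rows of a different width
        END {
            rows = (NR > 0) ? NR - 1 : 0
            printf "header_fields=%d data_rows=%d mismatched_rows=%d\n",
                   expected, rows, bad
        }' "$1"
}

# Example call, using the file name from this thread:
# check_columns 21001_irnt_male.all.score
```

A non-zero `mismatched_rows` would be consistent with the symptom described here (many headers, mostly empty columns), although it says nothing about the cause.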

choishingwan commented 5 years ago

This shouldn't be a problem with version 2.2.6, can you check that?

jennysjaarda commented 5 years ago

Hello, Sorry for the late reply (I was away on vacation). I tried with the new version and I am still having the same problem with the "all.score" file...

Jenny


choishingwan commented 5 years ago

I was unable to replicate this problem. Can you confirm the log is correct and show me the header of your .all.score file? Also, do you get a correct .all.score file when you use the toy data (the all score file should have 2941 columns)?

jennysjaarda commented 5 years ago

Hello,

I attached the most recent log file and a header of the file (first 10 lines). I do get a correct all.score with the toy data. The toy data is in PLINK format (whereas I am using bgen files); could this have any impact? The only other difference I see is that my association file has already been filtered to only include variants with p < 0.1. However, I also tried filtering the ASSOC file in the toy data set and had no issues.

Thanks for your help,

Jenny


choishingwan commented 5 years ago

Hi Jenny, unfortunately, I can't see the attachments. Could you please send them to me again? Thank you

jennysjaarda commented 5 years ago

Sorry about that. Does it work this time?


choishingwan commented 5 years ago

Unfortunately, still no luck with it. Mind trying to send this via GitHub? Maybe something funny happened when it was sent via email.

jennysjaarda commented 5 years ago

21001_irnt_male.log

21001_irnt_male.all.score.header.txt

choishingwan commented 5 years ago

Got it, perfect. Thank you

jennysjaarda commented 5 years ago

Indeed, a problem with sending via email. Let me know if you need anything else!

choishingwan commented 5 years ago

What do your 21001_irnt_male.prsice and 21001_irnt_male.best files look like?

I have just test-run the --all-score command on my bgen toy data and I still obtained a valid all score file. It's really strange that yours is malformed.

jennysjaarda commented 5 years ago

These two files both seem completely fine as far as I can tell... Do you want me to send a header of these results?

I could try running the Toy data but with my bgen data and try and isolate the problem that way...


jennysjaarda commented 5 years ago

Hello, I think I may have isolated the problem! My base file has been filtered and clumped to include a set of pruned variants with p<0.1.

When I modify this file in R as follows:

base=("my_base_data_file_name")
base_data <- read.table(base, header=T)
pvals <- runif(n=dim(base_data)[1], min=1e-12, max=.9100)
base_data$PVAL <- pvals
write.table(base_data, paste0(base, ".temp2"), row.names=F, quote=F, col.names=T)

And rerun PRSice with the $base.temp2 file, everything works completely fine. If I use my original file, I get a malformed file. So either there is something wrong with the PVAL column in my original file, or the program doesn't like a filtered file. Could there be a problem with scientific notation?
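One way to look into the scientific-notation question is to summarise the p-value column of the original base file. The sketch below is an illustration only and makes no claim about what PRSice does internally: `summarise_pvals` is a hypothetical name, and it assumes a whitespace-delimited file with a header line that contains the `PVAL` column used in the R snippet above.

```shell
# Hypothetical helper (not part of PRSice): summarise one column of a
# whitespace-delimited file that has a header line.
summarise_pvals() {
    # $1: path to the base file, $2: name of the p-value column
    awk -v col="$2" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col; failed = 1; exit 1 }
            next
        }
        {
            v = $idx
            # plain decimal or scientific notation, e.g. 0.05 or 1e-12
            if (v !~ /^[0-9]*[.]?[0-9]+([eE][-+]?[0-9]+)?$/) { other++; next }
            if (v ~ /[eE]/) sci++
            x = v + 0
            if (n == 0 || x < min) min = x
            if (n == 0 || x > max) max = x
            n++
        }
        END {
            if (failed) exit 1
            printf "n=%d scientific=%d non_numeric=%d min=%g max=%g\n",
                   n, sci, other, min, max
        }' "$1"
}

# Example call, using the file and column names from this thread:
# summarise_pvals male_GRS_0.1.txt PVAL
```

Comparing the summary for the original file with the one for the `.temp2` file would show whether the two differ in range, in the share of values written in scientific notation, or in non-numeric entries.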


jennysjaarda commented 5 years ago

Attached is my original base file, in case it's useful. male_GRS_0.1.txt

choishingwan commented 5 years ago

Could you give this build a try and see if it solves the problem?

https://www.dropbox.com/s/diof1hlwi45hetw/PRSice?dl=0

jennysjaarda commented 5 years ago

Yes, it seems to be working with this version!


choishingwan commented 5 years ago

That's great!

jennysjaarda commented 5 years ago

Out of curiosity, will this be resolved in the latest release, or should I just continue to use the version you provided at the Dropbox link?


choishingwan commented 5 years ago

It will be included in the next release (the executable you got is a work in progress). There are still some extra functions / features that need more work before I release it. However, for now, you can safely use the current version from the Dropbox link (the remaining features / functions are all related to speeding up the permutation).

jennysjaarda commented 5 years ago

Sounds good, thanks so much for the help!

Jenny


jennysjaarda commented 5 years ago

Hello,

I’m not sure if this is a problem with the version you sent me via Dropbox (I suspect not, since I am still using the same .R file), but I am now having a problem generating the plots. I get the following error:

Error: Error: None of the phenotype is identified in phenotype header! Execution halted

However, I loaded the pheno file as you did in your script:

header <- read.table(argv$pheno_file, nrows = 1, header = TRUE, check.names=FALSE)
# This will automatically filter out un-used phenos
valid.pheno <- phenos %in% colnames(header)

And the pheno seems to be there… any ideas?

Thanks, Jenny


choishingwan commented 5 years ago

Could you please send me the full log? Thanks.

-- Dr Shing Wan Choi, Postdoctoral Fellow, Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, NYC

jennysjaarda commented 5 years ago

100720_male.log

Sorry, here it is; however, I just solved the problem. I had used a bash variable with the --pheno flag, and the variable's value had literal quotes surrounding it. In case anyone else runs into this, you can use the following to strip the quotes:

pheno_var="\"XXX\""
temp="${pheno_var%\"}"
temp="${temp#\"}"
pheno_var=$temp

choishingwan commented 5 years ago

Great! (I've updated your code block so that it shows the escape sequence more clearly)
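For reuse, the quote-stripping fix above can be wrapped in a small function. This is a sketch in POSIX shell: `strip_quotes` is a hypothetical name, and `XXX` stands in for the real phenotype name, as in the original comment.

```shell
# Hypothetical helper: remove one leading and one trailing double quote.
strip_quotes() {
    s=$1
    s=${s#\"}   # drop a leading double quote, if present
    s=${s%\"}   # drop a trailing double quote, if present
    printf '%s\n' "$s"
}

pheno_var="\"XXX\""                       # the value itself contains quotes
printf 'before: %s\n' "$pheno_var"        # prints: before: "XXX"
pheno_var=$(strip_quotes "$pheno_var")
printf 'after:  %s\n' "$pheno_var"        # prints: after:  XXX
```

A value with no surrounding quotes passes through unchanged, so the function can be applied to a variable before handing it to the --pheno flag whether or not the quotes are present.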