Open jielab opened 5 months ago
No.
On Wed, Jun 12, 2024 at 11:02 PM Jie Huang @.***> wrote:
Hi,
I previously already used Pheweb to process some large GWAS files. Now my project manager decided to rename some of the input GWAS files, for example, renaming a LDL.gwas.gz file to LDL.2023.gwas.gz.
Now if I rerun pheweb, it will think that there is a new file LDL.2023.gwas.gz and begin to re-process it. Is there a way for me to let Pheweb know that some files are renamed so that it won't re-process them?
Thanks!
JH
Oh, actually there is. You get to choose the assoc_files field in pheno-list.json.
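For illustration, here is a minimal Python sketch of that edit: it swaps renamed paths into assoc_files and leaves phenocode alone. It assumes pheno-list.json is a JSON list of objects that each carry a phenocode string and an assoc_files list; the helper name repoint_assoc_files and the paths in the rename map are hypothetical, not part of pheweb.

```python
def repoint_assoc_files(phenos, renames):
    """Return a copy of the pheno list with renamed input paths swapped in.

    phenos  -- list of dicts, each with "phenocode" and "assoc_files"
               (assumed layout of pheno-list.json)
    renames -- dict mapping old path -> new path (hypothetical example data)
    Only assoc_files is rewritten; every other field, including phenocode,
    is copied unchanged.
    """
    updated = []
    for pheno in phenos:
        entry = dict(pheno)  # shallow copy so the input list is not mutated
        entry["assoc_files"] = [renames.get(path, path) for path in pheno["assoc_files"]]
        updated.append(entry)
    return updated


# In-memory demo with made-up paths.
phenos = [{"phenocode": "LDL.gwas", "assoc_files": ["GWAS-DIR/LDL.gwas.gz"]}]
renames = {"GWAS-DIR/LDL.gwas.gz": "GWAS-DIR/LDL.2023.gwas.gz"}
print(repoint_assoc_files(phenos, renames))
```

To apply it to a real file, load the list with json.load, pass it through the helper, and write it back with json.dump. Keep a backup of the original pheno-list.json first.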
Thanks, Peter!
Taking my example above: I renamed LDL.gwas.gz to LDL.2023.gwas.gz.
In the pheno-list.json file, if I change LDL.gwas.gz to LDL.2023.gwas.gz in the assoc_files field but keep the phenocode field unchanged, I guess pheweb is smart enough to check the timestamp of LDL.2023.gwas.gz, determine that it is not a new file, and therefore not re-process it.
A few days later, I got more GWAS data. I always use pheweb phenolist glob --star-is-phenocode "GWAS-DIR/*.gz" to create an updated pheno-list.json file. This time, the updated pheno-list.json file will have a new phenocode of LDL.2023.gwas. I guess this time pheweb will re-process it, even though it is still the same GWAS file.
Sorry to ask this seemingly complicated question. I was hoping there is a way to batch update file names in one place, so that my renamed GWAS files don't get re-processed. If there is no easy solution, I will simply re-process them.
Best regards, JH
Your understanding of pheweb's processing sounds correct to me. I don't understand your exact situation, but your interpretation sounds correct.
On Thu, Jun 13, 2024 at 1:32 AM Jie Huang @.***> wrote:
But pheweb will still re-process LDL.2023.gwas.gz, because nothing under that name exists in output directories such as generated-by-pheweb/parsed.
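A rough way to see which phenocodes would count as new is to compare them against what is already in the output directory. The sketch below assumes each processed phenotype leaves one entry named exactly after its phenocode inside generated-by-pheweb/parsed; that layout is an assumption here, so check it against your own output directory. The helper name is hypothetical.

```python
from pathlib import Path


def phenocodes_missing_output(phenocodes, parsed_dir):
    """List the phenocodes that have no entry of the same name in parsed_dir.

    Assumes one output entry per phenocode, named after the phenocode
    (an assumption about the output layout, not a documented guarantee).
    """
    parsed = Path(parsed_dir)
    return [code for code in phenocodes if not (parsed / code).exists()]
```

For example, phenocodes_missing_output(["LDL.gwas", "LDL.2023.gwas"], "generated-by-pheweb/parsed") would return only the codes with no existing output, which are the ones likely to be processed from scratch.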
Thanks, Peter!
My situation is: let's say that previously I had 100 GWAS and I ran pheweb process on them. It took a few days... Now my group has decided to rename those GWAS, for example, adding "2023" or "2024" to the original GWAS names.
In the future, my group will have more GWAS, with names like "2025" or "2026". And I always use pheweb phenolist glob --star-is-phenocode "GWAS-DIR/*.gz" to automatically generate and update the pheno-list.json file.
I am trying to use the new naming system without spending a few more days re-processing those 100 GWAS in pheweb.
Anyway, I guess the easiest way is to simply re-process everything, on the renamed GWAS files.
Best regards, JH
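One possible workaround, sketched below under stated assumptions, is to generate pheno-list.json with a small script instead of phenolist glob, so that the phenocode stays the same when a year tag is added to the file name. It assumes the year appears as its own dot-separated component (as in LDL.2023.gwas.gz) and that the old phenocode was the file name minus .gz. The function names are hypothetical. Whether pheweb then skips the renamed files still depends on its own up-to-date check, which this sketch does not verify.

```python
import re
from pathlib import Path

# Matches a dot-separated four-digit year component such as ".2023"
# when it is followed by another dot-separated component.
YEAR_TAG = re.compile(r"\.(?:19|20)\d{2}(?=\.)")


def stable_phenocode(filename):
    """Derive a phenocode that ignores a year tag in the file name.

    "GWAS-DIR/LDL.2023.gwas.gz" and "GWAS-DIR/LDL.gwas.gz" both map to "LDL.gwas".
    """
    name = Path(filename).name
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    return YEAR_TAG.sub("", name, count=1)


def build_pheno_list(paths):
    """Build pheno-list entries (phenocode + assoc_files) from input paths.

    Raises ValueError if two files collapse to the same phenocode, for
    example LDL.2023.gwas.gz and LDL.2025.gwas.gz in the same directory.
    """
    entries = {}
    for path in sorted(str(p) for p in paths):
        code = stable_phenocode(path)
        if code in entries:
            raise ValueError(f"{path} and {entries[code]['assoc_files'][0]} share phenocode {code}")
        entries[code] = {"phenocode": code, "assoc_files": [path]}
    return list(entries.values())
```

The result of build_pheno_list(Path("GWAS-DIR").glob("*.gz")) can be written out with json.dump. If later batches reuse a trait name with a different year, the collision check will fire, and the naming rule would need to change.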