bulik / ldsc

LD Score Regression (LDSC)
GNU General Public License v3.0

Cell type analysis: IOError: Could not open Cahoy.1.l2.ldscore[./gz/bz2] #147

Open frankwendt5010 opened 5 years ago

frankwendt5010 commented 5 years ago

I am trying to run the cell type analyses following the example with the Cahoy cell types, and I consistently get the error that Cahoy.1.l2.ldscore[./gz/bz2] cannot be opened. All files are executable, I tried moving Cahoy.ldcts into the Cahoy_1000Gv3_ldscores directory, I triple-checked that my paths to these files are correct, etc. The same error occurs with all of the available gene sets, and I cannot figure out what I am doing incorrectly.

If it helps, I am able to perform the analysis by calling each Cahoy.1., Cahoy.2., and Cahoy.3. cell type individually using the --ref-ld-chr and --h2 flags instead of the --h2-cts approach.

I appreciate any suggestions.

hilaryfinucane commented 5 years ago

Hi, Can you please send the commands you are executing? --ref-ld-chr will take the path to the LD score files that are in the 2nd column of the .ldcts file, e.g. --ref-ld-chr Cahoy_1000Gv3_ldscores/Cahoy.1.,Cahoy_1000Gv3_ldscores/Cahoy.control.,1000G_EUR_Phase3_baseline/baseline.
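
As a rough illustration of that mapping, here is a minimal sketch that turns the lines of an .ldcts file into the equivalent --ref-ld-chr values. It assumes each line holds a label, whitespace, and a comma-separated list of LD score prefixes; the function name and the "Astrocyte" label are hypothetical, and this is not taken from the ldsc source.

```python
def ref_ld_chr_args(ldcts_text, baseline_prefix):
    """Map each cell-type label to an equivalent --ref-ld-chr value."""
    args = {}
    for line in ldcts_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Assumed layout: label, whitespace, comma-separated prefixes.
        label, prefixes = line.split()
        # The baseline prefix is appended after the cell-type prefixes.
        args[label] = prefixes + "," + baseline_prefix
    return args


# Hypothetical one-line .ldcts content; the label is made up for illustration.
example = (
    "Astrocyte\t"
    "Cahoy_1000Gv3_ldscores/Cahoy.1.,Cahoy_1000Gv3_ldscores/Cahoy.control.\n"
)

for label, value in ref_ld_chr_args(
    example, "1000G_EUR_Phase3_baseline/baseline."
).items():
    print(label, "--ref-ld-chr", value)
```

Running one cell type this way with --h2 can help separate a path problem from a problem specific to --h2-cts.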

Best, Hilary


frankwendt5010 commented 5 years ago

Hi Hilary,

To clarify, I am able to run partitioned h2 using the paths to each LD score file directly with, for example, --ref-ld-chr /scratch60/frw5/Cahoy_1000Gv3_ldscores/Cahoy.control., /scratch60/frw5/Cahoy_1000Gv3_ldscores/Cahoy.1., /scratch60/frw5/Cahoy_1000Gv3_ldscores/Cahoy.2.

I am unable to follow the example here https://github.com/bulik/ldsc/wiki/Cell-type-specific-analyses, which gives the error described above.

Here is the command I am executing:

python /home/frw5/ldsc/ldsc.py \
  --h2-cts /scratch60/frw5/test.sumstats.gz \
  --ref-ld-chr /scratch60/frw5/1000G_EUR_Phase3_baseline/baseline. \
  --w-ld-chr /scratch60/frw5/weights_hm3_no_hla/weights. \
  --ref-ld-chr-cts /scratch60/frw5/Cahoy.ldcts \
  --out /scratch60/frw5/test_cahoy

Frank

hilaryfinucane commented 5 years ago

Did you run the whole example script, including wget commands? The second column of Cahoy.ldcts has the path to the LD score files, which is relative to where you are calling ldsc.py from, so if you moved the LD scores around or are calling ldsc from a different directory, that might cause a problem.
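
One way to test for this is to resolve every prefix in the .ldcts file against the directory ldsc.py is launched from and see which ones have no chromosome 1 file. The sketch below is only an illustration: the helper name is hypothetical, and the file naming (prefix + chromosome + ".l2.ldscore", optionally followed by ".gz" or ".bz2") is inferred from the error message rather than from the ldsc source.

```python
import glob
import os


def unresolved_prefixes(ldcts_path, run_dir, chrom=1):
    """Return (label, prefix) pairs with no LD score file for `chrom`."""
    missing = []
    with open(ldcts_path) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            # Assumed layout: label, whitespace, comma-separated prefixes.
            label, prefixes = line.split()
            for prefix in prefixes.split(","):
                # Relative prefixes are resolved against run_dir;
                # absolute prefixes are left unchanged by os.path.join.
                base = os.path.join(run_dir, prefix)
                # Matches .l2.ldscore, .l2.ldscore.gz and .l2.ldscore.bz2.
                pattern = glob.escape(base) + str(chrom) + ".l2.ldscore*"
                if not glob.glob(pattern):
                    missing.append((label, prefix))
    return missing
```

Calling `unresolved_prefixes("Cahoy.ldcts", os.getcwd())` from the directory where ldsc.py is run should return an empty list if every relative path in the second column resolves.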

Best, Hilary


roxyisat-rex commented 4 years ago

Hey Frank

I ran into a very similar error while running partitioned heritability. It says: IOError: Could not open Users/ldsc/Celltype1_ldscores/ Celltype_1.1.l2.ldscore[./gz/bz2]. Have you solved your problem, or do you have any advice for me? I am 100% certain it is not caused by a directory mistake or anything like that. Any advice is welcome! Thank you!

Best regards, Roxy

roxyisat-rex commented 4 years ago

Sorry for posting again, I realised I didn't reply to you earlier.

mkoromina commented 2 years ago

Hi all, I ran into exactly the same problem, so any advice or tips on how to fix it would be very welcome!

Best, Maria

maxdudek commented 1 year ago

Also running into this problem!

Here's my command:

python $LDSC/ldsc.py \
  --h2 $SUM_STATS \
  --ref-ld-chr ldscore/HepG2_new_ocr_pir_overlap.,$LDSR_1K/BaselineLD_1kg_phase3_ldscore/baselineLD. \
  --w-ld-chr $LDSR_1K/1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC. \
  --overlap-annot \
  --frqfile-chr $LDSR_1K/1000G_Phase3_frq/1000G.EUR.QC. \
  --out out/HepG2_new_ocr_pir_overlap

Error:

IOError: Could not open ldscore/HepG2_new_ocr_pir_overlap.1.l2.ldscore[./gz/bz2]

And what the file looks like, from the directory I run LDSC from:

>$ gunzip < ld_score/HepG2_new_ocr_pir_overlap.1.l2.ldscore.gz | head
CHR     SNP     BP      L2
1       rs3131972       752721  0.058
1       rs3131969       754182  0.098
1       rs3131967       754334  0.098
1       rs1048488       760912  0.055
1       rs12562034      768448  -0.009
1       rs12124819      776546  0.054
1       rs4040617       779322  0.087
1       rs2905036       792480  0.444
1       rs4970383       838555  0.125

@mkoromina @roxyisat-rex @frankwendt5010

If any of you were able to solve this please let me know what the issue was! Any help would be appreciated.

mkoromina commented 1 year ago

Hi @maxdudek ,

I ended up using a different version of ldsc.py (the one that ships with the PolyFun software) and then ran the script as described here: https://github.com/bulik/ldsc/wiki/Cell-type-specific-analyses

ldsc.py \
    --h2-cts UKBB_BMI.sumstats.gz \
    --ref-ld-chr 1000G_EUR_Phase3_baseline/baseline. \
    --out BMI_${cts_name} \
    --ref-ld-chr-cts $cts_name.ldcts \
    --w-ld-chr weights_hm3_no_hla/weights.

Here, ldsc.py is the Python script that I found among the scripts of the PolyFun software.

Hope this works for you!

maxdudek commented 1 year ago

Hi @mkoromina,

Thanks for the reply! I just realized today that I forgot the underscore in ld_score when specifying the input directory, and that was the issue. Apologies for the confusion.
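
For anyone hitting the same error, a typo like this can be caught by comparing the directory part of the prefix against the directories that actually exist. The following is a minimal sketch using the standard library's difflib; the helper name and the similarity cutoff are arbitrary choices for illustration, not part of ldsc.

```python
import difflib
import os


def suggest_directory(prefix, run_dir="."):
    """Suggest existing directories that resemble the prefix's directory."""
    wanted = os.path.dirname(prefix)  # e.g. "ldscore"
    parent, name = os.path.split(wanted)
    search_root = os.path.join(run_dir, parent)
    if not os.path.isdir(search_root):
        return []
    candidates = [
        entry
        for entry in os.listdir(search_root)
        if os.path.isdir(os.path.join(search_root, entry))
    ]
    if name in candidates:
        return []  # the directory exists, so the problem lies elsewhere
    # Rank existing directory names by string similarity to the one given.
    return difflib.get_close_matches(name, candidates, n=3, cutoff=0.6)
```

In a working directory that contains `ld_score`, calling `suggest_directory("ldscore/HepG2_new_ocr_pir_overlap.")` would point back to the directory with the underscore.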