broadinstitute / gtex-pipeline

GTEx & TOPMed data production and analysis pipelines
BSD 3-Clause "New" or "Revised" License

error with eqtl_expression.py #87

Closed. jfertaj closed this 3 months ago

jfertaj commented 1 year ago

Hi all,

I am trying to run the eqtl_prepare_expression.py script using the transcript-level TPM and count GCTs, but I get an error when the normalized BED file is merged with the BED file created from the GTF.

Here is the error:

Loading expression data
Normalizing data (tmm)
  * 199324 genes in input tables.
  * 97619 genes remain after thresholding.
Traceback (most recent call last):
  File "/src/eqtl_prepare_expression.py", line 189, in <module>
    norm_bed_df = prepare_bed(norm_df, bed_template_df, chr_subset=chr_list)
  File "/src/eqtl_prepare_expression.py", line 57, in prepare_bed
    bed_df = bed_df[bed_df.chr.isin(chr_subset)]
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 4376, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'chr'

This is my command. I am running the Docker image through Singularity:

# sigmoid
singularity run --bind ${nagore}:/data ${juandir}/gtex_qtl_v8.img \
    /bin/bash -c "python3 /src/eqtl_prepare_expression.py \
    /data/${tpms_gct_sigmoid} /data/${counts_gct_sigmoid} /data/${annotation_gtf} \
    /data/${sigmoid_lookup} /data/${vcf_chr_list} ${prefix_sigmoid} \
    --tpm_threshold 0.1 \
    --count_threshold 3 \
    --sample_frac_threshold 0.2 \
    --normalization_method tmm"

francois-a commented 9 months ago

Thanks for reporting this. The error is due to a change in pandas groupby behavior. Fixed in 6b4e25d4d0f40e1e77b0888fa78ca434514f9339.
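
For anyone stuck on an older image, here is a minimal sketch of a sort-and-subset step that does not rely on groupby/apply at all. It is not the code from the commit above. It assumes the failure comes from the chromosome column no longer being available as a regular column after the groupby step. The function name `sort_and_subset_bed` and the `chr_col` parameter are hypothetical; the column names `chr` and `start` are taken from the traceback and from typical BED layouts.

```python
import pandas as pd


def sort_and_subset_bed(bed_df, chr_subset=None, chr_col="chr"):
    """Sort a BED-like DataFrame by start position within each chromosome,
    keeping chromosomes in order of first appearance, then optionally keep
    only the chromosomes listed in chr_subset.

    Hypothetical helper for illustration; not part of gtex-pipeline.
    """
    # Fail early with a readable message instead of an AttributeError.
    if chr_col not in bed_df.columns:
        raise KeyError(
            "column {!r} not found; available columns: {}".format(
                chr_col, list(bed_df.columns)
            )
        )

    # Rank chromosomes by first appearance so their original order is kept.
    chr_rank = {c: i for i, c in enumerate(pd.unique(bed_df[chr_col]))}

    # Sort by (chromosome rank, start) using a temporary helper column.
    out = bed_df.assign(_chr_rank=bed_df[chr_col].map(chr_rank))
    out = out.sort_values(["_chr_rank", "start"]).drop(columns="_chr_rank")

    if chr_subset is not None:
        # Bracket access avoids attribute lookup on the DataFrame.
        out = out[out[chr_col].isin(chr_subset)]
    return out


if __name__ == "__main__":
    # Toy data, made up for this example.
    example = pd.DataFrame(
        {
            "chr": ["chr2", "chr1", "chr2", "chr1"],
            "start": [50, 30, 10, 20],
            "end": [60, 40, 20, 30],
            "gene_id": ["g1", "g2", "g3", "g4"],
        }
    )
    print(sort_and_subset_bed(example, chr_subset=["chr1"]))
```

If the KeyError fires, the printed column list should show what the chromosome column is actually called in the merged table, which is a quick way to check whether this is the same problem.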