Closed MatthewTCManion closed 3 months ago
Hi @MatthewTCManion
Could you try running the following command in the terminal?
scenicplus grn_inference TF_to_gene \
--multiome_mudata_fname ACC_GEX.h5mu \
--tf_names tf_names.txt \
--temp_dir /data/PetrosLab/Matt/scenicplus/consensus_peak_bulk_750bp/tmp/ \
--out_tf_to_gene_adjacencies tf_to_gene_adj.tsv \
--method GBM \
--n_cpu 20 \
--seed 666
Best,
Seppe
Hi Seppe, I'm having this exact same error now! Running your suggested command gives me the same error message as running the whole SCENIC+ pipeline (screenshot attached). EDIT: just realised my failure is slightly earlier in the pipeline (region_to_gene instead of tf_to_gene). EDIT #2 (solution): downgrading to Python 3.11.8 (was 3.11.9 prior) solved all these issues. @SeppeDeWinter maybe worthwhile specifying 3.11.8 and not just 3.11 in the tutorials? :)
I am running it now with the command above, but I had previously tried downgrading to Python 3.11.8 and got the same result.
@SeppeDeWinter Running the command without Snakemake, I get a segfault, but I can't tell why. Usually that's a resource allocation issue, but the memory use stays firmly under the allocated limit.
UPDATE: I tried it again with a larger CPU allocation and it finished correctly. I will test the rest of the pipeline now.
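When a step dies without a useful message, as with the segfault described above, the shell's exit status can at least confirm which signal killed the process. The following is a minimal sketch, not part of scenicplus: `run_and_report` is a hypothetical helper name, and it relies only on the shell convention that a command killed by a signal returns 128 plus the signal number (139 for SIGSEGV).

```shell
# Hypothetical helper (not part of scenicplus): run a command and report
# whether it exited normally or was killed by a signal.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        # Shells report death by signal as 128 + signal number.
        sig=$((status - 128))
        # kill -l translates a signal number to its name; fall back if unknown.
        name=$(kill -l "$sig" 2>/dev/null) || name="unknown"
        echo "killed by signal $sig (SIG$name)"
    else
        echo "exited with status $status"
    fi
    return "$status"
}
```

Prefixing the failing command with the helper, for example `run_and_report scenicplus grn_inference TF_to_gene` followed by the same arguments as before, prints the signal name once the process ends. This only identifies the signal; it does not say why the process received it.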
I had a similar exit on eGRN_extended, but it ran fine when I used the command outside of the Snakemake pipeline:
scenicplus grn_inference eGRN \
--is_extended \
--TF_to_gene_adj_fname tf_to_gene_adj.tsv \
--region_to_gene_adj_fname region_to_gene_adj.tsv \
--cistromes_fname cistromes_extended.h5ad \
--ranking_db_fname /data/PetrosLab/Matt/scenicplus/Nkx_750bp.regions_vs_motifs.rankings.feather \
--eRegulon_out_fname eRegulons_extended.tsv \
--temp_dir /data/PetrosLab/Matt/scenicplus/consensus_peak_bulk_750bp/tmp/ \
--order_regions_to_genes_by importance \
--order_TFs_to_genes_by importance \
--gsea_n_perm 1000 \
--quantiles 0.85 0.90 0.95 \
--top_n_regionTogenes_per_gene 5 10 15 \
--top_n_regionTogenes_per_region \
--min_regions_per_gene 0 \
--rho_threshold 0.05 \
--min_target_genes 10 \
I'm not sure what the issue is, but it appears to be with the Snakemake workflow, not the specific steps.
UPDATE: the snakemake workflow worked for AUCell_extended after running the previous 2 steps manually, but failed on eGRN_direct:
scenicplus grn_inference eGRN \
--TF_to_gene_adj_fname tf_to_gene_adj.tsv \
--region_to_gene_adj_fname region_to_gene_adj.tsv \
--cistromes_fname cistromes_direct.h5ad \
--ranking_db_fname /data/PetrosLab/Matt/scenicplus/Nkx_750bp.regions_vs_motifs.rankings.feather \
--eRegulon_out_fname eRegulon_direct.tsv \
--temp_dir /data/PetrosLab/Matt/scenicplus/consensus_peak_bulk_750bp/tmp/ \
--order_regions_to_genes_by importance \
--order_TFs_to_genes_by importance \
--gsea_n_perm 1000 \
--quantiles 0.85 0.90 0.95 \
--top_n_regionTogenes_per_gene 5 10 15 \
--top_n_regionTogenes_per_region \
--min_regions_per_gene 0 \
--rho_threshold 0.05 \
--min_target_genes 10 \
--n_cpu 20
All steps ran correctly when I set the number of CPUs to 50, so it seems it was all just a resource allocation issue.
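For anyone hitting the same thing, a quick pre-flight check of the CPU budget may help before launching a step. This is a sketch under assumptions: the thread suggests, but does not establish, that a mismatch between the value passed to `--n_cpu` and the CPUs actually allocated to the job was the cause, and `check_cpu_budget` is a hypothetical helper, not a scenicplus command.

```shell
# Hypothetical pre-flight check (not part of scenicplus): compare the worker
# count you plan to pass as --n_cpu with the CPUs this process may use.
check_cpu_budget() {
    requested=$1
    # nproc (GNU coreutils) counts the CPUs available to the current process,
    # which respects the CPU affinity mask. Fall back to the online CPU count
    # on systems without nproc.
    available=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    if [ "$requested" -gt "$available" ]; then
        echo "requested $requested CPUs but only $available are available" >&2
        return 1
    fi
    echo "ok: $requested of $available CPUs"
}
```

Running `check_cpu_budget 20` inside the same job or session that will run the pipeline shows whether 20 workers fit the allocation. The reported count may not reflect every kind of scheduler limit, so treat it as a hint rather than a guarantee.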
I am able to run the SCENIC+ Snakemake pipeline up until tf_to_gene, at which point it appears to run for every cell, but then hits an unspecified error right after starting "Adding correlation coefficients to adjacencies". It doesn't appear to have any issue with the adjacencies for region_to_gene, so I'm not sure why it would fail here.
I have gotten this same failure multiple times with different resource allocations.
Error log:
Environment: