Zhangruiqi111 closed this issue 4 months ago
Hi @Zhangruiqi111
You can download it using the following command:
scenicplus prepare_data download_genome_annotations
usage: scenicplus prepare_data download_genome_annotations [-h] --species SPECIES
       --genome_annotation_out_fname GENOME_ANNOTATION_OUT_FNAME
       --chromsizes_out_fname CHROMSIZES_OUT_FNAME
       [--biomart_host BIOMART_HOST] [--do_not_use_ucsc_chromosome_style]
scenicplus prepare_data download_genome_annotations: error: the following arguments are required: --species, --genome_annotation_out_fname, --chromsizes_out_fname
Best,
Seppe
Thank you for your reply! Which values does the "--species" parameter accept?
For the motif enrichment step it should be "homo_sapiens", and for download_genome_annotations it should be "hsapiens". Sorry for the inconsistency.
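Since the two steps expect different spellings of the same species, a small lookup can help keep them apart in a wrapper script. This is only an illustrative sketch: `SPECIES_BY_STEP` and `species_for` are hypothetical names, not part of scenicplus, and the table contains only the two human values quoted in this thread.

```python
# Species identifiers quoted in this thread, keyed by the step that uses them.
# Hypothetical helper: not part of the scenicplus package.
SPECIES_BY_STEP = {
    # BioMart dataset prefix, as in "hsapiens_gene_ensembl"
    "download_genome_annotations": "hsapiens",
    # Value given in this thread for the motif enrichment step
    "motif_enrichment": "homo_sapiens",
}


def species_for(step: str) -> str:
    """Return the human species string for a given step, or raise ValueError."""
    try:
        return SPECIES_BY_STEP[step]
    except KeyError:
        raise ValueError(f"No species value recorded for step {step!r}") from None
```

For example, `species_for("download_genome_annotations")` returns `"hsapiens"`, while an unknown step name raises `ValueError` instead of silently passing the wrong spelling.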
Best,
Seppe
Ok, thank you very much!
Best,
Ruiqi Zhang
(scenicplus) [yojetsharma@pakeeza outs]$ scenicplus prepare_data download_genome_annotations \
> --species "hsapiens" \
> --genome_annotation_out_fname "/home/praghu/yojetsharma/pycistopic_final/outs/genome_annotation.tsv" \
> --chromsizes_out_fname "/home/praghu/yojetsharma/pycistopic_final/outs/chromsizes.tsv"
2024-10-17 11:29:33,134 Download gene annotation INFO Using genome: GRCh38.p14
Could not find IdList on https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=genome&term=GRCh38.p14
Returning gene annotation without subestting for assembled chromosomesand converting to UCSC style. Please make sure that the chromosome namesin the returned object match with the chromosome names in the scplus_obj.Chromosome sizes will not be returned
2024-10-17 11:29:34,268 SCENIC+ INFO Chrosomome sizes was not found, please provide this information manually.
2024-10-17 11:29:34,269 SCENIC+ INFO Saving genome annotation to: /home/praghu/yojetsharma/pycistopic_final/outs/genome_annotation.tsv
I saved genome_annotation.tsv manually, but the command still reports that the chromosome sizes were not found. I then downloaded the hg38.chrom.sizes file, as done in the pycisTopic tutorial, saved it to the outs/ folder, and ran snakemake again, but the pipeline stops:
(scenicplus) [yojetsharma@pakeeza Snakemake]$ snakemake --cores 20
Assuming unrestricted shared filesystem usage for local execution.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 20
Rules claiming more threads will be scaled down.
Job stats:
job                            count
---------------------------  -------
AUCell_direct                      1
AUCell_extended                    1
all                                1
download_genome_annotations        1
eGRN_direct                        1
eGRN_extended                      1
get_search_space                   1
motif_enrichment_dem               1
prepare_menr                       1
region_to_gene                     1
scplus_mudata                      1
tf_to_gene                         1
total                             12
Select jobs to execute...
Execute 1 jobs...
[Thu Oct 17 11:45:29 2024]
localrule download_genome_annotations:
output: /home/praghu/yojetsharma/pycistopic_final/outs/genome_annotation.tsv, /home/praghu/yojetsharma/pycistopic_final/outs/chromsizes
jobid: 8
reason: Missing output files: /home/praghu/yojetsharma/pycistopic_final/outs/chromsizes
resources: tmpdir=/tmp
2024-10-17 11:47:11,805 Download gene annotation INFO Using genome: GRCh38.p14
Could not find IdList on https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=genome&term=GRCh38.p14
Returning gene annotation without subestting for assembled chromosomesand converting to UCSC style. Please make sure that the chromosome namesin the returned object match with the chromosome names in the scplus_obj.Chromosome sizes will not be returned
2024-10-17 11:47:11,816 SCENIC+ INFO Chrosomome sizes was not found, please provide this information manually.
2024-10-17 11:47:11,816 SCENIC+ INFO Saving genome annotation to: /home/praghu/yojetsharma/pycistopic_final/outs/genome_annotation.tsv
Waiting at most 5 seconds for missing files.
MissingOutputException in rule download_genome_annotations in file /ncbs_gs/nlsas_data/usershares/praghu/yojetsharma/pycistopic_final/scplus_pipeline/Snakemake/workflow/Snakefile, line 221:
Job 8 completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
/home/praghu/yojetsharma/pycistopic_final/outs/chromsizes
Removing output files of failed job download_genome_annotations since they might be corrupted:
/home/praghu/yojetsharma/pycistopic_final/outs/genome_annotation.tsv
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-17T114523.626937.snakemake.log
WorkflowError:
At least one job did not complete successfully.
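The log above shows Snakemake waiting for a file at exactly /home/praghu/yojetsharma/pycistopic_final/outs/chromsizes, so a file saved under another name (hg38.chrom.sizes, or chromsizes.tsv as in the manual command) does not satisfy the rule. One possible workaround is to convert the UCSC file and write it to that exact path. The sketch below makes an assumption that should be checked against your SCENIC+ version: that the pipeline expects a tab-separated table with a header row and the columns Chromosome, Start and End. The function name `write_chromsizes` is hypothetical.

```python
import csv
from pathlib import Path


def write_chromsizes(ucsc_chrom_sizes: str, out_path: str) -> int:
    """Convert a UCSC two-column chrom.sizes file (name, length) into a
    tab-separated table with a header row.

    ASSUMPTION: the downstream step reads the columns Chromosome, Start, End.
    Returns the number of chromosomes written.
    """
    rows = []
    with open(ucsc_chrom_sizes, newline="") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            # Every chromosome starts at 0 and ends at its length.
            rows.append((fields[0], 0, int(fields[1])))

    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with open(out, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["Chromosome", "Start", "End"])
        writer.writerows(rows)
    return len(rows)
```

Calling `write_chromsizes("outs/hg38.chrom.sizes", "outs/chromsizes")` would then place the table where the rule looks for it. Note that the failed job also removed genome_annotation.tsv, so that file has to be recreated as well before Snakemake can treat both outputs of download_genome_annotations as present.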
Hello,
When I run the command below, I don't know where to find genome_annotation.tsv. Can I download it manually?
scenicplus grn_inference motif_enrichment_dem \
    --region_set_folder 'outs/region_sets' \
    --dem_db_fname '10x_brain_1kb_bg_with_mask.regions_vs_motifs.scores.feather' \
    --output_fname_dem_result "dem_results.hdf5" \
    --temp_dir "" \
    --species "hsapiens" \
    --fraction_overlap_w_dem_database 0.4 \
    --max_bg_regions 500 \
    --balance_number_of_promoters \
    --genome_annotation "genome_annotation.tsv" \
    --promoter_space 1_000 \
    --adjpval_thr 0.05 \
    --log2fc_thr 1.0 \
    --mean_fg_thr 0.0 \
    --motif_hit_thr 3.0 \
    --path_to_motif_annotations 'aertslab_motif_colleciton/v10nr_clust_public/snapshots/motifs-v10-nr.mgi-m0.00001-o0.0.tbl' \
    --annotation_version 'v10nr_clust' \
    --motif_similarity_fdr 0.001 \
    --orthologous_identity_threshold 0.0 \
    --annotations_to_use "Direct_annot Orthology_annot" \
    --write_html \
    --output_fname_dem_html "dem_results.html" \
    --seed 666