Closed yojetsharma closed 1 month ago
Sorry, I am facing the same problem. How can I fix it?
I solved it by following the pycistopic tutorial, where the chromsizes file is generated:
import os
import pandas as pd
import pyranges as pr

# out_dir is the output directory used in the pycistopic tutorial
chromsizes = pd.read_table(os.path.join(out_dir, "qc", "hg38.chrom_sizes_and_alias.tsv"))
chromsizes
# Rename the columns to the names PyRanges expects and add a Start column
chromsizes.rename({"# ucsc": "Chromosome", "length": "End"}, axis = 1, inplace = True)
chromsizes["Start"] = 0
chromsizes = pr.PyRanges(chromsizes[["Chromosome", "Start", "End"]])
# Build the genome annotation from the TSS file
pr_annotation = pd.read_table(
    os.path.join(out_dir, "qc", "tss.bed")
).rename(
    {"Name": "Gene", "# Chromosome": "Chromosome"}, axis = 1)
pr_annotation["Transcription_Start_Site"] = pr_annotation["Start"]
genome_annotation = pr.PyRanges(pr_annotation)
You may want to use pandas when generating the genome annotation file, and save it in the /outs folder.
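For the saving step, here is a minimal sketch of writing such a table to a tab-separated file with pandas. The input table is a toy stand-in for hg38.chrom_sizes_and_alias.tsv with made-up lengths, and the output folder and the file name chromsizes.tsv are hypothetical, so point them at whatever paths your own setup expects.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for hg38.chrom_sizes_and_alias.tsv (lengths are illustrative only).
raw = pd.DataFrame({"# ucsc": ["chr1", "chr2"], "length": [1000, 800]})

# Same renaming as in the snippet above: "# ucsc" -> Chromosome, "length" -> End.
chromsizes = raw.rename(columns={"# ucsc": "Chromosome", "length": "End"})
chromsizes["Start"] = 0
chromsizes = chromsizes[["Chromosome", "Start", "End"]]

# Hypothetical output location; replace with your own folder (e.g. the /outs folder).
out_dir = tempfile.mkdtemp()
chromsizes_path = os.path.join(out_dir, "chromsizes.tsv")

# Write a tab-separated file without the pandas index column.
chromsizes.to_csv(chromsizes_path, sep="\t", index=False)

# Read it back to confirm the file has the expected layout.
reloaded = pd.read_table(chromsizes_path)
print(reloaded)
```

The same to_csv call should work for the annotation table before it is wrapped in pr.PyRanges.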
After generating the chromsizes and annotation.tsv files manually, how did you modify the config and Snakemake file? If I rerun Snakemake, these files are removed and the pipeline tries to download them again, which leads to an error. How can I skip this download step?
I didn't have to modify those lines of code (in the gene_search_space.py file), but you can try the suggestion in https://github.com/aertslab/scenicplus/issues/357#issuecomment-2064501829 and see if it helps.
They might be getting removed because of the format of either the chromsizes or genome_annotation file?
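If you want to rule out a format problem, a quick column check like the sketch below might help. The required column lists are assumptions taken from the tables built earlier in this thread, not a confirmed specification of what the pipeline expects, and missing_columns is a hypothetical helper.

```python
import pandas as pd

# Assumed column layouts, copied from the tables built earlier in this thread.
CHROMSIZES_COLUMNS = ["Chromosome", "Start", "End"]
ANNOTATION_COLUMNS = ["Chromosome", "Start", "Gene", "Transcription_Start_Site"]


def missing_columns(table, required):
    """Return the required column names that are absent from the table."""
    return [name for name in required if name not in table.columns]


# Toy tables for illustration; load your own files with pd.read_table instead.
chromsizes = pd.DataFrame({"Chromosome": ["chr1"], "Start": [0], "End": [1000]})
annotation = pd.DataFrame({"Chromosome": ["chr1"], "Start": [100], "Gene": ["GENE_A"]})

chromsizes_missing = missing_columns(chromsizes, CHROMSIZES_COLUMNS)
annotation_missing = missing_columns(annotation, ANNOTATION_COLUMNS)

print("chromsizes is missing:", chromsizes_missing)
print("annotation is missing:", annotation_missing)
```

An empty list means every assumed column is present. It does not prove the file is otherwise valid.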
Hey @JinKyu-Cheong @yojetsharma, since tss.bed doesn't provide any gene body length information, I don't think this is a proper solution. FYI, I attached my anno file.
I am facing issues with the chromsizes file not getting saved. I have gone through issues #429 and #468 and implemented the solution offered in #357, but to no avail. As suggested in #429, I tried manually downloading these files as follows:
But I still get the error that the chromosome sizes are not found. Then I downloaded the hg38.chrom.sizes file manually and ran the pipeline again, but I still get this error:
Checking the Chromosomes in the metadata_regions of scplus_obj: