cistrome / MIRA

Python package for analysis of multiomic single cell RNA-seq and ATAC-seq.

Zero motif hits when using human FA #3

Closed Chengwei94 closed 2 years ago

Chengwei94 commented 2 years ago

Hi,

I am currently trying to analyse some human data with MIRA. When I ran mira.tl.get_motif_hits_in_peaks, I got 0 motif hits. I tried both the Cell Ranger hg38 genome FASTA and the UCSC hg38 genome FASTA, and both returned 0 motifs.

mira.tl.get_motif_hits_in_peaks(atac_main, genome_fasta="hg38.fa", pvalue_threshold=1e-5) # default is p<1e-5
motif_scores = atac_model.get_motif_scores(atac_main)

INFO:mira.tools.motif_scan:Getting peak sequences ...
162985it [00:01, 82902.36it/s]
INFO:mira.tools.motif_scan:Scanning peaks for motif hits with p >= 1e-05 ...
INFO:mira.tools.motif_scan:Building motif background models ...
INFO:mira.tools.motif_scan:Formatting hits matrix ...
INFO:mira.adata_interface.regulators:Added key to varm: motifs_hits
INFO:mira.adata_interface.regulators:Added key to uns: motifs
INFO:mira.adata_interface.topic_model:Fetching key X_topic_compositions from obsm
INFO:mira.topic_model.base:Predicting latent variables ...
Imputing features: 100%|██████████| 12/12 [00:08<00:00, 1.35it/s]

My atac_main.varm looks like this:

image

Also, which genome file and tss_data should I use for mira.tl.get_distance_to_TSS with a human dataset? Thanks!

AllenWLynch commented 2 years ago

Hi there,

Can you tell me which version of MIRA and which OS you are running? Can you also confirm that you provided the correct columns for the chrom, start, and end of each peak in your adata? If the wrong column headers are provided (as strings to the mira.tl.get_motif_hits_in_peaks method), the peaks may not represent real loci and no motifs will be found.
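The column check described above can be sketched with plain pandas before running the motif scan. This is only an illustration, not part of MIRA: the helper name check_peak_columns, the default column names "chr", "start", and "end", and the toy table are all assumptions, and the table merely stands in for atac_main.var.

```python
import pandas as pd


def check_peak_columns(var, chrom="chr", start="start", end="end"):
    """Return a list of human-readable problems found in a peak table.

    Hypothetical helper: column names are assumptions, adjust to your data.
    """
    missing = [c for c in (chrom, start, end) if c not in var.columns]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    # Coerce to numbers so that string coordinates are handled too.
    starts = pd.to_numeric(var[start], errors="coerce")
    ends = pd.to_numeric(var[end], errors="coerce")

    n_bad = int((starts.isna() | ends.isna()).sum())
    if n_bad:
        problems.append(f"{n_bad} peaks have non-numeric start/end")

    # A peak whose end is not after its start contains no sequence to scan.
    n_empty = int((ends <= starts).sum())
    if n_empty:
        problems.append(f"{n_empty} peaks have end <= start")
    return problems


# Toy table with made-up coordinates; the second peak has start == end.
peaks = pd.DataFrame({
    "chr": ["chr1", "chr1", "chr2"],
    "start": [100, 500, 900],
    "end": [350, 500, 1200],
})
print(check_peak_columns(peaks))
```

An empty list means the three columns exist and every peak spans at least one base, which rules out the most common reasons for an empty hits matrix.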

-Allen


Chengwei94 commented 2 years ago

Hi Allen,

I think I made a stupid mistake: the end and start of my peaks were the same, so no motifs were found. Thanks!

P.S. Also, which genome file and tss_data should I use for mira.tl.get_distance_to_TSS with a human dataset? Thanks!

AllenWLynch commented 2 years ago

Hi,

No problem at all. I’m working on documentation and checks that will catch those types of issues automatically in the future.

For human data, I recommend downloading the human canonical TSS locations from the NCBI table browser. That table will include the “official” splice variant information and TSS, and will also usually include one record per gene symbol.
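As a rough sketch of preparing such a table, the snippet below derives a TSS per transcript and keeps one record per gene symbol. The column names (chrom, strand, txStart, txEnd, geneSymbol) and the rows are assumptions made up for illustration; check the header of the file you actually download, and check the MIRA documentation for the format that mira.tl.get_distance_to_TSS expects.

```python
import io

import pandas as pd

# Made-up rows in a genePred-like layout; real files will have more columns.
toy = io.StringIO(
    "chrom\tstrand\ttxStart\ttxEnd\tgeneSymbol\n"
    "chr1\t+\t1000\t5000\tGENEA\n"
    "chr1\t+\t1200\t5000\tGENEA\n"
    "chr2\t-\t2000\t8000\tGENEB\n"
)
tss = pd.read_csv(toy, sep="\t")

# On the '+' strand the TSS is the transcript start; on '-' it is the end.
tss["tss"] = tss["txStart"].where(tss["strand"] == "+", tss["txEnd"])

# Guard against repeated symbols by keeping the first record per gene.
tss = tss.drop_duplicates(subset="geneSymbol", keep="first").reset_index(drop=True)
print(tss[["chrom", "strand", "tss", "geneSymbol"]])
```

Keeping the first record is an arbitrary choice here; if the table lists several transcripts per gene, you may prefer a different rule for picking the representative one.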

Let me know if this helps,

Allen

Chengwei94 commented 2 years ago

Thanks, that solves my problem.