Open LliliansCalvo opened 4 years ago
Hi, I just took a quick look and that is not currently possible but the code should be pretty easy to amend so that it can accept any header (this code was written 7 years ago for a specific project, unfortunately when I was a graduate student I didn't have the foresight to make it more generalizable!). If you look in parseAll.py you should be able to find where it can be updated so that's no longer a problem.
If you need help with updating the python code, I'm not available to fix this currently but if you give me 1-2 days I'll try to take a look.
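For anyone who would rather not edit parseAll.py, a minimal sketch of the workaround route is to rewrite the FASTA headers so they resemble the headers quoted elsewhere in this thread (`mm10_refGene_NM_... strand=+`). The `rewrite_header` and `rewrite_fasta` helpers and the `phaw_refGene_` prefix are hypothetical, and the exact pattern parseAll.py checks for is an assumption, not something confirmed here.

```python
def rewrite_header(header, prefix="phaw_refGene_"):
    """Return a header shaped like '>PREFIX<id> strand=+'.

    The target shape is copied from the example headers in this thread;
    what parseAll.py actually accepts is an assumption.
    """
    seq_id = header.lstrip(">").split()[0]  # keep only the first token as the ID
    if not seq_id.startswith(prefix):
        seq_id = prefix + seq_id
    return ">{0} strand=+".format(seq_id)


def rewrite_fasta(lines, prefix="phaw_refGene_"):
    """Rewrite every header line; sequence lines pass through unchanged."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            out.append(rewrite_header(line, prefix))
        elif line:  # skip blank lines
            out.append(line)
    return out


if __name__ == "__main__":
    demo = [">TRINITY_GG_86181_c0_g1_i4", "GAATAGGGCACTCGTGCC"]
    print("\n".join(rewrite_fasta(demo)))
```

In the examples posted in this thread, the expression file keeps the bare ID (for instance `NM_000326734`) while the sequence header carries the prefix, so the prefix probably should not be added to the expression file.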
Hi, that's very kind of you.
So I have been trying to “cheat” by adding to my sequences what they need to pass that filter, and that worked. But I am still getting new errors. My files look like this:
--seqf
phaw_refGene_TRINITY_GG_86181_c0_g1_i4 strand=+
GAATAGGGCACTCGTGCCACTAGACCCCAACTGCAGCGAGGATGAAGCAGAGGACGACGA
--exprf
TRINITY_GG_24814_c0_g1_i1 -2.50388609
TRINITY_GG_143598_c0_g1_i4 -2.504865179
TRINITY_GG_9206_c1_g1_i5 -2.50968568
--mirf
miR-92|LQNS02278089.1_34108_3p Parhyale hawaiensis 34108_3p
AATTGCACTCGTCCCGGCCTGC
miR-92|LQNS02278089.1_34106_3p Parhyale hawaiensis 34106_3p
AATTGCACTGATCCCGGCCTGC
miR-92|LQNS02278089.1_34110_3p Parhyale hawaiensis 34110_3p
AATTGCACTCGTCCCGGCCTTC
miR-184|LQNS02000211.1_1952_3p Parhyale hawaiensis 1952_3p
TGGACGGAGAACTGATAAGGGC
miR-184|LQNS02000211.1_1954_3p Parhyale hawaiensis 1954_3p
TGGACGGAGAACTGATAAGGGC
At this point I do not understand why it still wouldn’t work.
error
Solving MLM with GEMMA
Parsing files for PLINK
Removing 4096 motifs with no information
/mnt/fls01-home01/mqbpwlc2/gridware/share/python/2.7.8/lib/python2.7/site-packages/numpy/lib/function_base.py:392: RuntimeWarning: Mean of empty slice.
  avg = a.mean(axis)
/mnt/fls01-home01/mqbpwlc2/gridware/share/python/2.7.8/lib/python2.7/site-packages/numpy/core/_methods.py:78: RuntimeWarning: invalid value encountered in true_divide
  ret, rcount, out=ret, casting='unsafe', subok=False)
/mnt/fls01-home01/mqbpwlc2/gridware/share/python/2.7.8/lib/python2.7/site-packages/numpy/lib/function_base.py:2522: RuntimeWarning: Degrees of freedom <= 0 for slice
  c = cov(x, y, rowvar)
/mnt/fls01-home01/mqbpwlc2/gridware/share/python/2.7.8/lib/python2.7/site-packages/numpy/lib/function_base.py:2451: RuntimeWarning: divide by zero encountered in true_divide
  c = np.true_divide(1, fact)
/mnt/fls01-home01/mqbpwlc2/gridware/share/python/2.7.8/lib/python2.7/site-packages/numpy/lib/function_base.py:2451: RuntimeWarning: invalid value encountered in multiply
  c = np.true_divide(1, fact)
Traceback (most recent call last):
  File "/mnt/fls01-home01/mqbpwlc2/privatemodules/MixMir/MixMir.py", line 157, in <module>
    doAll(doKin=doKin)
  File "/mnt/fls01-home01/mqbpwlc2/privatemodules/MixMir/MixMir.py", line 18, in doAll
    runParse(doKin=doKin)
  File "/mnt/fls01-home01/mqbpwlc2/privatemodules/MixMir/MixMir.py", line 43, in runParse
    parseAll.doAll(doKin=doKin,seqf=seqf,exprf=exprf,outfnkin=kinf,outPedFile=outPedFile,outMapFile=outMapFile,kkin=kkin,kmotif=kmotif,frac=frac,useFast=useFast)
  File "/mnt/fls01-home01/mqbpwlc2/privatemodules/MixMir/parseAll.py", line 64, in doAll
    makeKin(dcounts=kin_dcounts,genes=genes,outfn=outfnkin,useFast=useFast)
  File "/mnt/fls01-home01/mqbpwlc2/privatemodules/MixMir/parseAll.py", line 247, in makeKin
    np.savetxt(outfn,K,delimiter='\t',fmt='%.4f')
  File "/mnt/fls01-home01/mqbpwlc2/gridware/share/python/2.7.8/lib/python2.7/site-packages/numpy/lib/npyio.py", line 1377, in savetxt
    "Expected 1D or 2D array, got %dD array instead" % X.ndim)
ValueError: Expected 1D or 2D array, got 0D array instead
How many motifs are you trying to assess? Is it 4096? For some reason that number of motifs was removed from the rest of the downstream analysis.
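As a quick sanity check on where 4096 could come from: there are exactly 4^6 = 4096 possible 6-mers over A, C, G, T. If the run used `--k_motif 6` (as the command quoted later in this thread does), "Removing 4096 motifs" would mean that every possible motif was dropped. The `all_kmers` helper below is hypothetical and is not part of MixMir.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Enumerate every k-mer over the given alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


if __name__ == "__main__":
    kmers = all_kmers(6)
    print(len(kmers))  # 4 ** 6 possible 6-mers
```

If all motifs are being discarded, one plausible explanation (an assumption, not confirmed in this thread) is that the sequence and expression files are not being matched up, so no motif has any usable data.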
That's odd, I do not understand where that number is coming from. However, since I thought it could be an ID problem, I have now changed all the IDs so they look similar to your example data. It is still not running with my data, but it does run with the test data. Any idea what this could be due to? Thanks!
mm10_refGene_NM_000326734 strand=+
mm10_refGene_NM_000341290 strand=+
mm10_refGene_NM_000379325 strand=+
mm10_refGene_NM_000379317 strand=+
mm10_refGene_NM_000378785 strand=+
NM_000326734 -2.636433599
NM_000341290 -4.948907879
NM_000379325 -6.665454548
NM_000379317 -3.25614215
python MixMir.py --seqf runs/g.ens_IDs.e.3UTR_LC2vsLC3_resSig_down.fa --exprf runs/d.ens_downreg_rank.txt --mirf testdat/testmirs.fa --k_kin 6 --k_motif 6 --N 20 --fast 0 --out testdat/test
Solving MLM with GEMMA
Parsing files for PLINK
Traceback (most recent call last):
  File "MixMir.py", line 157, in <module>
Is your expression file tab delimited or space delimited? (should be tab)
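A minimal sketch for converting a space-delimited expression file to tab-delimited, assuming each line holds exactly two fields (an ID and a numeric value), as in the excerpts posted in this thread. The `to_tab_delimited` helper is hypothetical and not part of MixMir.

```python
def to_tab_delimited(lines):
    """Re-join whitespace-separated 'ID value' lines with a single tab."""
    out = []
    for lineno, line in enumerate(lines, 1):
        fields = line.split()  # splits on any run of spaces or tabs
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError("line %d: expected 2 fields, got %d" % (lineno, len(fields)))
        float(fields[1])  # raises ValueError if the value is not numeric
        out.append("\t".join(fields))
    return out


if __name__ == "__main__":
    demo = ["NM_000326734 -2.636433599", "NM_000341290   -4.948907879"]
    for row in to_tab_delimited(demo):
        print(row)
```

Validating the field count and the numeric value up front makes a malformed line fail with its line number, rather than surfacing later as an empty array.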
Just checking in here--were you able to get your data set to run with tab delimited input data?
After changing to tab-delimited input, there is only one new problem left. It runs to completion, but the output table looks like this:
Solving MLM with GEMMA
Parsing files for PLINK
Removing 0 motifs with no information
/mnt/fls01-home01/mqbpwlc2/gridware/share/python/2.7.8/lib/python2.7/site-packages/numpy/lib/function_base.py:2530: RuntimeWarning: invalid value encountered in true_divide
  c /= stddev[:, None]
/mnt/fls01-home01/mqbpwlc2/gridware/share/python/2.7.8/lib/python2.7/site-packages/numpy/lib/function_base.py:2531: RuntimeWarning: invalid value encountered in true_divide
  c /= stddev[None, :]
Rank Motif P-value P-value (Bonf) Coef NUTRs miRNAs Matched
1 TTTTTT nan nan nan 1716 [A1]miR-LQNS02278075-1-32324-5p
2 GTTTTT nan nan nan 1467
3 ATTTTT nan nan nan 1857
4 CTTTTT nan nan nan 1465
5 TGTTTT nan nan nan 1483 [A1]miR-bantam
6 GGTTTT nan nan nan 1033 [3]miR-981, [3]miR-981, [3]miR-981
7 AGTTTT nan nan nan 1613
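The two warnings above point at lines where NumPy divides a covariance matrix by standard deviations (`c /= stddev[...]`). A 0/0 in that division is reported as "invalid value", which is what a zero-variance input vector would produce. Whether that is what happens inside MixMir with this data set is an assumption; the pure-Python sketch below only illustrates the arithmetic, and `pearson` and `constant_rows` are hypothetical helpers.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns nan when either vector has zero variance."""
    n = float(len(x))
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # the 0/0 case that NumPy warns about
    return cov / (sx * sy)


def constant_rows(rows):
    """Indices of rows whose values are all identical (zero variance)."""
    return [i for i, row in enumerate(rows) if len(set(row)) <= 1]


if __name__ == "__main__":
    print(pearson([1, 1, 1], [1, 2, 3]))           # constant first vector
    print(constant_rows([[0, 0, 0], [1, 2, 3]]))   # which rows are constant
```

Running a check like `constant_rows` on the per-gene count vectors or the expression values would show whether any input is constant, which is one candidate explanation for the `nan` columns.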
Hi, I am trying to run MixMir. It now works with your test data, but I get this error with my own data: ('No refseq ID detected', '>TRINITY_GG_86181_c0_g1_i4'). Can MixMir be run with my own FASTA file regardless of its headers? FASTA header example: