snipe-bio / explore

https://snipe-bio.github.io/explore/

Missing samples when visualizing all samples #1

Open drtamermansour opened 1 month ago

drtamermansour commented 1 month ago

I can't find a reason for filtering out these samples

drtamermansour commented 1 month ago

We have SAMN11079552_SRX5487588.sig among the sketches but searching by the experiment accession gives No matching data points found.

mr-eyes commented 1 month ago

We have SAMN11079552_SRX5487588.sig among the sketches but searching by the experiment accession gives No matching data points found.

The sigs I sent you are the sigs downloaded from SRA, not the final sigs included in snipe explore and the manuscript. Check the sample filtration report.

drtamermansour commented 1 month ago

Where is this report?

drtamermansour commented 1 month ago

I can't find a reason for filtering out these samples

  • Missing samples in PRJEB31756: ERX3253126
  • Missing samples in PRJNA338327: SRX2010868
  • Missing samples in PRJNA407973: It should have 50 samples, only 40 displayed
  • Missing whole project (PRJNA882864)

The first 3 are found in the sample filtration report. The 4th one was published in July 2024 and is thus missing from our current dataset.

drtamermansour commented 1 month ago

All the experiments in the sample filtration report have signatures in our dataset (except the 54 RNA samples), which means that these samples are just filtered from the visualization, which is GREAT!

However, based on the targeted search phrase, the run selector has 19860 runs. The current signatures are missing 208 runs for no obvious reason (they are not reported in the sample filtration report). I think they may have failed the sketching step. The search phrase is:

(txid9615[organism]) AND "illumina"[Platform] AND ("wgs"[Strategy] OR "wxs"[Strategy]) AND ("genomic"[Source] OR "metagenomic"[Source]) AND ("2000/01/01"[PDAT]: "2023/12/25"[PDAT])

Here they are:

Run,Assay Type,BioProject,BioSample,Experiment,LibrarySource,sample_type
DRR001566,WGS,PRJDB2266,SAMD00009664,DRX001090,GENOMIC,
ERR019093,WGS,PRJEB2162,SAMEA676114,ERX007596,GENOMIC,
ERR10879384,WGS,PRJEB16012,SAMEA112638849,ERX10324146,GENOMIC,
ERR10879385,WGS,PRJEB16012,SAMEA112638850,ERX10324147,GENOMIC,
ERR10879386,WGS,PRJEB16012,SAMEA112638851,ERX10324148,GENOMIC,
ERR10879387,WGS,PRJEB16012,SAMEA112638852,ERX10324149,GENOMIC,
ERR10879388,WGS,PRJEB16012,SAMEA112638853,ERX10324150,GENOMIC,
ERR10879389,WGS,PRJEB16012,SAMEA112638854,ERX10324151,GENOMIC,
ERR10879390,WGS,PRJEB16012,SAMEA112638855,ERX10324152,GENOMIC,
ERR10879391,WGS,PRJEB16012,SAMEA112638856,ERX10324153,GENOMIC,
ERR10879392,WGS,PRJEB16012,SAMEA112638857,ERX10324154,GENOMIC,
ERR10879393,WGS,PRJEB16012,SAMEA112638858,ERX10324155,GENOMIC,
ERR10879394,WGS,PRJEB16012,SAMEA112638859,ERX10324156,GENOMIC,
ERR10879395,WGS,PRJEB16012,SAMEA112638860,ERX10324157,GENOMIC,
ERR10879396,WGS,PRJEB16012,SAMEA112638861,ERX10324158,GENOMIC,
ERR10879398,WGS,PRJEB16012,SAMEA112638862,ERX10324159,GENOMIC,
ERR10879399,WGS,PRJEB16012,SAMEA112638863,ERX10324160,GENOMIC,
ERR10879400,WGS,PRJEB16012,SAMEA112638864,ERX10324161,GENOMIC,
ERR10879401,WGS,PRJEB16012,SAMEA112638865,ERX10324162,GENOMIC,
ERR10879402,WGS,PRJEB16012,SAMEA112638866,ERX10324163,GENOMIC,
ERR10879403,WGS,PRJEB16012,SAMEA112638867,ERX10324164,GENOMIC,
ERR10879404,WGS,PRJEB16012,SAMEA112638868,ERX10324165,GENOMIC,
ERR10879405,WGS,PRJEB16012,SAMEA112638869,ERX10324166,GENOMIC,
ERR10879406,WGS,PRJEB16012,SAMEA112638870,ERX10324167,GENOMIC,
ERR10879407,WGS,PRJEB16012,SAMEA112638871,ERX10324168,GENOMIC,
ERR10879408,WGS,PRJEB16012,SAMEA112638872,ERX10324169,GENOMIC,
ERR10879409,WGS,PRJEB16012,SAMEA112638873,ERX10324170,GENOMIC,
ERR10879410,WGS,PRJEB16012,SAMEA112638874,ERX10324171,GENOMIC,
ERR10879411,WGS,PRJEB16012,SAMEA112638875,ERX10324172,GENOMIC,
ERR10879412,WGS,PRJEB16012,SAMEA112638876,ERX10324173,GENOMIC,
ERR10879413,WGS,PRJEB16012,SAMEA112638877,ERX10324174,GENOMIC,
ERR10879414,WGS,PRJEB16012,SAMEA112638878,ERX10324175,GENOMIC,
ERR10879415,WGS,PRJEB16012,SAMEA112638879,ERX10324176,GENOMIC,
ERR10879416,WGS,PRJEB16012,SAMEA112638880,ERX10324177,GENOMIC,
ERR10879417,WGS,PRJEB16012,SAMEA112638881,ERX10324178,GENOMIC,
ERR10879418,WGS,PRJEB16012,SAMEA112638882,ERX10324179,GENOMIC,
ERR11284568,WGS,PRJEB36029,SAMEA7190222,ERX10692177,GENOMIC,
ERR12042708,WGS,PRJEB16012,SAMEA114382012,ERX11425910,GENOMIC,
ERR12042722,WGS,PRJEB16012,SAMEA114382026,ERX11425924,GENOMIC,
ERR12042723,WGS,PRJEB16012,SAMEA114382027,ERX11425925,GENOMIC,
ERR12042726,WGS,PRJEB16012,SAMEA114382030,ERX11425928,GENOMIC,
ERR12042727,WGS,PRJEB16012,SAMEA114382031,ERX11425929,GENOMIC,
ERR12042728,WGS,PRJEB16012,SAMEA114382032,ERX11425930,GENOMIC,
ERR12042729,WGS,PRJEB16012,SAMEA114382033,ERX11425931,GENOMIC,
ERR12042730,WGS,PRJEB16012,SAMEA114382034,ERX11425932,GENOMIC,
ERR12042733,WGS,PRJEB16012,SAMEA114382037,ERX11425935,GENOMIC,
ERR12042734,WGS,PRJEB16012,SAMEA114382038,ERX11425936,GENOMIC,
ERR12042735,WGS,PRJEB16012,SAMEA114382039,ERX11425937,GENOMIC,
ERR12042737,WGS,PRJEB16012,SAMEA114382041,ERX11425939,GENOMIC,
ERR12042739,WGS,PRJEB16012,SAMEA114382043,ERX11425941,GENOMIC,
ERR12042742,WGS,PRJEB16012,SAMEA114382046,ERX11425944,GENOMIC,
ERR12042744,WGS,PRJEB16012,SAMEA114382048,ERX11425946,GENOMIC,
ERR12042745,WGS,PRJEB16012,SAMEA114382049,ERX11425947,GENOMIC,
ERR12042746,WGS,PRJEB16012,SAMEA114382050,ERX11425948,GENOMIC,
ERR12042748,WGS,PRJEB16012,SAMEA114382052,ERX11425950,GENOMIC,
ERR12042749,WGS,PRJEB16012,SAMEA114382053,ERX11425951,GENOMIC,
ERR12128356,WXS,PRJEB55864,SAMEA111341937,ERX11504274,GENOMIC,
ERR12128395,WXS,PRJEB55864,SAMEA111341914,ERX11504313,GENOMIC,
ERR12128453,WXS,PRJEB55865,SAMEA111342006,ERX11504371,GENOMIC,
ERR12128549,WXS,PRJEB55865,SAMEA111342023,ERX11504467,GENOMIC,
ERR12128558,WXS,PRJEB55865,SAMEA111342038,ERX11504476,GENOMIC,
ERR12128579,WXS,PRJEB55865,SAMEA111341978,ERX11504497,GENOMIC,
ERR12128582,WXS,PRJEB55865,SAMEA111341981,ERX11504500,GENOMIC,
ERR12128588,WXS,PRJEB55865,SAMEA111341990,ERX11504506,GENOMIC,
ERR12128594,WXS,PRJEB55865,SAMEA111341994,ERX11504512,GENOMIC,
ERR12389295,WGS,PRJEB71384,SAMEA115059303,ERX11765419,GENOMIC,
ERR12389296,WGS,PRJEB71384,SAMEA115059304,ERX11765420,GENOMIC,
ERR12389297,WGS,PRJEB71384,SAMEA115059305,ERX11765421,GENOMIC,
ERR1672230,WXS,PRJEB12081,SAMEA3955983,ERX1742282,GENOMIC,
ERR1688115,WGS,PRJEB16012,SAMEA4506890,ERX1757732,GENOMIC,
ERR1994752,WGS,PRJEB21089,SAMEA104106509,ERX2054621,GENOMIC,
ERR1994753,WGS,PRJEB21089,SAMEA104106510,ERX2054622,GENOMIC,
ERR2008787,WGS,PRJEB16012,SAMEA104125122,ERX2068550,GENOMIC,
ERR2061041,WGS,PRJEB22026,SAMEA104190270,ERX2120098,GENOMIC,
ERR2061048,WGS,PRJEB22026,SAMEA104190272,ERX2120105,GENOMIC,
ERR206211,WGS,PRJEB2311,SAMEA1561463,ERX180896,GENOMIC,
ERR2196282,WGS,PRJEB17926,SAMEA104389243,ERX2252492,GENOMIC,
ERR2516739,WGS,PRJEB16012,SAMEA3444493,ERX2536341,GENOMIC,
ERR3153863,WGS,PRJEB28163,SAMEA4825472,ERX3182940,GENOMIC,
ERR3687187,WGS,PRJEB16012,SAMEA6249486,ERX3676792,GENOMIC,
ERR3687194,WGS,PRJEB16012,SAMEA6249493,ERX3676799,GENOMIC,
ERR3687198,WGS,PRJEB16012,SAMEA6249497,ERX3676803,GENOMIC,
ERR3687200,WGS,PRJEB16012,SAMEA6249499,ERX3676805,GENOMIC,
ERR410265,WGS,PRJEB2162,SAMEA1706024,ERX376626,GENOMIC,
ERR7251998,WGS,PRJEB16012,SAMEA10644715,ERX6821454,GENOMIC,
ERR7251999,WGS,PRJEB16012,SAMEA10644716,ERX6821455,GENOMIC,
ERR7252000,WGS,PRJEB16012,SAMEA10644717,ERX6821456,GENOMIC,
ERR7252001,WGS,PRJEB16012,SAMEA10644718,ERX6821457,GENOMIC,
ERR7252002,WGS,PRJEB16012,SAMEA10644719,ERX6821458,GENOMIC,
ERR7252003,WGS,PRJEB16012,SAMEA10644720,ERX6821459,GENOMIC,
ERR7252004,WGS,PRJEB16012,SAMEA10644721,ERX6821460,GENOMIC,
ERR7252005,WGS,PRJEB16012,SAMEA10644722,ERX6821461,GENOMIC,
ERR7252006,WGS,PRJEB16012,SAMEA10644723,ERX6821462,GENOMIC,
ERR7252007,WGS,PRJEB16012,SAMEA10644724,ERX6821463,GENOMIC,
ERR7252008,WGS,PRJEB16012,SAMEA10644725,ERX6821464,GENOMIC,
ERR7252009,WGS,PRJEB16012,SAMEA10644726,ERX6821465,GENOMIC,
ERR7252010,WGS,PRJEB16012,SAMEA10644727,ERX6821466,GENOMIC,
ERR7252011,WGS,PRJEB16012,SAMEA10644728,ERX6821467,GENOMIC,
ERR7252012,WGS,PRJEB16012,SAMEA10644729,ERX6821468,GENOMIC,
ERR7252013,WGS,PRJEB16012,SAMEA10644730,ERX6821469,GENOMIC,
ERR7252014,WGS,PRJEB16012,SAMEA10644731,ERX6821470,GENOMIC,
ERR7252015,WGS,PRJEB16012,SAMEA10644732,ERX6821471,GENOMIC,
ERR7252016,WGS,PRJEB16012,SAMEA10644733,ERX6821472,GENOMIC,
ERR7252017,WGS,PRJEB16012,SAMEA10644734,ERX6821473,GENOMIC,
ERR7252018,WGS,PRJEB16012,SAMEA10644736,ERX6821474,GENOMIC,
ERR7252019,WGS,PRJEB16012,SAMEA10644737,ERX6821475,GENOMIC,
ERR7252020,WGS,PRJEB16012,SAMEA10644738,ERX6821476,GENOMIC,
ERR7252021,WGS,PRJEB16012,SAMEA10644739,ERX6821477,GENOMIC,
ERR7252022,WGS,PRJEB16012,SAMEA10644740,ERX6821478,GENOMIC,
ERR7252023,WGS,PRJEB16012,SAMEA10644741,ERX6821479,GENOMIC,
ERR7252024,WGS,PRJEB16012,SAMEA10644742,ERX6821480,GENOMIC,
ERR7252025,WGS,PRJEB16012,SAMEA10644743,ERX6821481,GENOMIC,
ERR7252026,WGS,PRJEB16012,SAMEA10644744,ERX6821482,GENOMIC,
ERR7252027,WGS,PRJEB16012,SAMEA10644745,ERX6821483,GENOMIC,
ERR7252028,WGS,PRJEB16012,SAMEA10644746,ERX6821484,GENOMIC,
ERR7252029,WGS,PRJEB16012,SAMEA10644747,ERX6821485,GENOMIC,
ERR7252030,WGS,PRJEB16012,SAMEA10644748,ERX6821486,GENOMIC,
ERR7252031,WGS,PRJEB16012,SAMEA10644749,ERX6821487,GENOMIC,
ERR7252032,WGS,PRJEB16012,SAMEA10644750,ERX6821488,GENOMIC,
ERR7530913,WGS,PRJEB16012,SAMEA10833662,ERX7101609,GENOMIC,
ERR7530914,WGS,PRJEB16012,SAMEA10833663,ERX7101610,GENOMIC,
ERR7530915,WGS,PRJEB16012,SAMEA10833664,ERX7101611,GENOMIC,
ERR870949,WXS,PRJEB7540,SAMEA2706839,ERX950550,GENOMIC,
ERR871051,WXS,PRJEB7540,SAMEA2706845,ERX950652,GENOMIC,
ERR871110,WXS,PRJEB7540,SAMEA2706904,ERX950711,GENOMIC,
ERR871145,WXS,PRJEB7540,SAMEA2706843,ERX950746,GENOMIC,
ERR871206,WXS,PRJEB7540,SAMEA2706904,ERX950807,GENOMIC,
ERR9872050,WGS,PRJEB16012,SAMEA110175941,ERX9417282,GENOMIC,
ERR9872051,WGS,PRJEB16012,SAMEA110175942,ERX9417283,GENOMIC,
ERR9872052,WGS,PRJEB16012,SAMEA110175943,ERX9417284,GENOMIC,
ERR9872053,WGS,PRJEB16012,SAMEA110175944,ERX9417285,GENOMIC,
ERR9872054,WGS,PRJEB16012,SAMEA110175945,ERX9417286,GENOMIC,
ERR9872055,WGS,PRJEB16012,SAMEA110175946,ERX9417287,GENOMIC,
ERR9872056,WGS,PRJEB16012,SAMEA110175947,ERX9417288,GENOMIC,
ERR9872057,WGS,PRJEB16012,SAMEA110175948,ERX9417289,GENOMIC,
ERR9872058,WGS,PRJEB16012,SAMEA110175949,ERX9417290,GENOMIC,
ERR9872059,WGS,PRJEB16012,SAMEA110175950,ERX9417291,GENOMIC,
ERR9872060,WGS,PRJEB16012,SAMEA110175951,ERX9417292,GENOMIC,
ERR9872061,WGS,PRJEB16012,SAMEA110175952,ERX9417293,GENOMIC,
ERR9872062,WGS,PRJEB16012,SAMEA110175953,ERX9417294,GENOMIC,
ERR9872063,WGS,PRJEB16012,SAMEA110175954,ERX9417295,GENOMIC,
ERR9872064,WGS,PRJEB16012,SAMEA110175955,ERX9417296,GENOMIC,
ERR9872065,WGS,PRJEB16012,SAMEA110175956,ERX9417297,GENOMIC,
ERR9872066,WGS,PRJEB16012,SAMEA110175523,ERX9417298,GENOMIC,
ERR9872067,WGS,PRJEB16012,SAMEA110175524,ERX9417299,GENOMIC,
ERR9872068,WGS,PRJEB16012,SAMEA110175525,ERX9417300,GENOMIC,
ERR9872069,WGS,PRJEB16012,SAMEA110175526,ERX9417301,GENOMIC,
ERR9872070,WGS,PRJEB16012,SAMEA110175527,ERX9417302,GENOMIC,
ERR9872071,WGS,PRJEB16012,SAMEA110175528,ERX9417303,GENOMIC,
ERR9872072,WGS,PRJEB16012,SAMEA110175529,ERX9417304,GENOMIC,
ERR9872073,WGS,PRJEB16012,SAMEA110175530,ERX9417305,GENOMIC,
ERR9872074,WGS,PRJEB16012,SAMEA110175531,ERX9417306,GENOMIC,
ERR9872075,WGS,PRJEB16012,SAMEA110175532,ERX9417307,GENOMIC,
ERR9872076,WGS,PRJEB16012,SAMEA110175533,ERX9417308,GENOMIC,
ERR9872077,WGS,PRJEB16012,SAMEA110175534,ERX9417309,GENOMIC,
ERR9872078,WGS,PRJEB16012,SAMEA110175535,ERX9417310,GENOMIC,
ERR9872079,WGS,PRJEB16012,SAMEA110175536,ERX9417311,GENOMIC,
ERR9872080,WGS,PRJEB16012,SAMEA110175537,ERX9417312,GENOMIC,
ERR9872081,WGS,PRJEB16012,SAMEA110175538,ERX9417313,GENOMIC,
ERR9872082,WGS,PRJEB16012,SAMEA110175539,ERX9417314,GENOMIC,
ERR9872083,WGS,PRJEB16012,SAMEA110175540,ERX9417315,GENOMIC,
ERR9872084,WGS,PRJEB16012,SAMEA110175541,ERX9417316,GENOMIC,
ERR9872085,WGS,PRJEB16012,SAMEA110175542,ERX9417317,GENOMIC,
ERR9872086,WGS,PRJEB16012,SAMEA110175543,ERX9417318,GENOMIC,
SRR18398330,WGS,PRJNA816174,SAMN26850680,SRX14532237,GENOMIC,
SRR18398332,WGS,PRJNA816174,SAMN26850678,SRX14532235,GENOMIC,
SRR18398343,WGS,PRJNA816174,SAMN26850691,SRX14532224,GENOMIC,
SRR18398351,WGS,PRJNA816174,SAMN26850675,SRX14532216,GENOMIC,
SRR18637676,WGS,PRJNA823593,SAMN27307438,SRX14744599,GENOMIC,
SRR2016114,WGS,PRJNA263947,SAMN03580379,SRX1022220,GENOMIC,
SRR2016116,WGS,PRJNA263947,SAMN03580379,SRX1022222,GENOMIC,
SRR2016118,WGS,PRJNA263947,SAMN03580380,SRX1022224,GENOMIC,
SRR2016120,WGS,PRJNA263947,SAMN03580380,SRX1022226,GENOMIC,
SRR21770665,WGS,PRJNA800779,SAMN25330635,SRX17765696,GENOMIC,
SRR21770683,WGS,PRJNA800779,SAMN25331248,SRX17765678,GENOMIC,
SRR21770691,WGS,PRJNA800779,SAMN25332909,SRX17765670,GENOMIC,
SRR21770703,WGS,PRJNA800779,SAMN25333853,SRX17765658,GENOMIC,
SRR21770708,WGS,PRJNA800779,SAMN25330636,SRX17765653,GENOMIC,
SRR21770715,WGS,PRJNA800779,SAMN25331352,SRX17765646,GENOMIC,
SRR24220295,WGS,PRJNA957464,SAMN34254335,SRX20016709,GENOMIC,
SRR2756256,WGS,PRJNA299099,SAMN04195504,SRX1360965,GENOMIC,
SRR2827592,WGS,PRJNA266585,SAMN03168374,SRX1386894,GENOMIC,
SRR4302047,WGS,PRJNA344694,SAMN05831101,SRX2196351,GENOMIC,
SRR5798843,WGS,PRJNA384927,SAMN06893646,SRX2978277,GENOMIC,
SRR5798844,WGS,PRJNA384927,SAMN06893647,SRX2978276,GENOMIC,
SRR7107656,WXS,PRJNA448733,SAMN02144217,SRX4035893,GENOMIC,
SRR7120280,WGS,PRJNA448733,SAMN08873492,SRX4041907,GENOMIC,
SRR776307,WGS,PRJNA192935,SAMN01974489,SRX249948,GENOMIC,
SRR776429,WGS,PRJNA192935,SAMN01974489,SRX249987,GENOMIC,
SRR776430,WGS,PRJNA192935,SAMN01974489,SRX249988,GENOMIC,
SRR776431,WGS,PRJNA192935,SAMN01974489,SRX250030,GENOMIC,
SRR776435,WGS,PRJNA192935,SAMN01974489,SRX250034,GENOMIC,
SRR7780833,WXS,PRJNA489159,SAMN09948259,SRX4636055,GENOMIC,
SRR7780840,WXS,PRJNA489159,SAMN09949394,SRX4636048,GENOMIC,
SRR782083,WGS,PRJNA192935,SAMN01974494,SRX250954,GENOMIC,
SRR782085,WGS,PRJNA192935,SAMN01974495,SRX250956,GENOMIC,
SRR867444,WGS,PRJNA203084,SAMN02144217,SRX286473,GENOMIC,
SRR867445,WGS,PRJNA203084,SAMN02144217,SRX286473,GENOMIC,
SRR870510,WXS,PRJNA203084,SAMN02144215,SRX288295,GENOMIC,
SRR870511,WXS,PRJNA203084,SAMN02144215,SRX288295,GENOMIC,
SRR870512,WXS,PRJNA203084,SAMN02144219,SRX288297,GENOMIC,
SRR870513,WXS,PRJNA203084,SAMN02144219,SRX288297,GENOMIC,
SRR870514,WXS,PRJNA203084,SAMN02144217,SRX288296,GENOMIC,
SRR870515,WXS,PRJNA203084,SAMN02144217,SRX288296,GENOMIC,
SRR870517,WXS,PRJNA203084,SAMN02178578,SRX288298,GENOMIC,
SRR870519,WXS,PRJNA203084,SAMN02178578,SRX288298,GENOMIC,
SRR870520,WXS,PRJNA203085,SAMN02178576,SRX288294,GENOMIC,
SRR870521,WXS,PRJNA203085,SAMN02178576,SRX288294,GENOMIC,
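
For anyone who wants to reproduce this kind of comparison, here is a minimal Python sketch of one way to cross-check a run selector export against the available signature files. It is not the script used for the list above. It assumes signatures are named <BioSample>_<Experiment>.sig (inferred from the SAMN11079552_SRX5487588.sig example earlier in this thread) and that the export uses the column names in the header row above. The function names and the demo accessions (RUN1, SAMPLE1, ...) are hypothetical placeholders.

```python
import csv
import io


def expected_sig_name(row):
    """Build the signature file name assumed for one run-table row."""
    # Assumption: signatures are named <BioSample>_<Experiment>.sig,
    # following the SAMN11079552_SRX5487588.sig example in this thread.
    return f"{row['BioSample']}_{row['Experiment']}.sig"


def runs_without_signature(run_table, sig_names):
    """Return the run-table rows whose expected signature is absent.

    run_table: an open text file (or file-like object) holding the CSV export.
    sig_names: an iterable of signature file names that actually exist.
    """
    available = set(sig_names)
    reader = csv.DictReader(run_table)
    return [row for row in reader if expected_sig_name(row) not in available]


# Tiny in-memory demo with made-up accessions (not real data).
demo_table = io.StringIO(
    "Run,Assay Type,BioProject,BioSample,Experiment,LibrarySource,sample_type\n"
    "RUN1,WGS,PRJ1,SAMPLE1,EXP1,GENOMIC,\n"
    "RUN2,WGS,PRJ1,SAMPLE2,EXP2,GENOMIC,\n"
)
demo_sigs = ["SAMPLE1_EXP1.sig"]

missing = runs_without_signature(demo_table, demo_sigs)
print([row["Run"] for row in missing])  # prints ['RUN2']
```

On real data, sig_names could be collected from the directory that holds the .sig files, and run_table would be the run selector CSV opened with open(). Because the match is made per experiment, runs that share an experiment (for example SRR867444 and SRR867445 under SRX286473) are reported together when that experiment's signature is absent.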