Open davidpng opened 10 years ago
It looks like many of the antigen-fluorophore labels were not parsed correctly, especially PE-Texas Red:
select TubeTypesInstances.tube_type, antigens, MIN(date) as min_date, MAX(date) as max_date, COUNT(*) as count from TubeCases INNER JOIN TubeTypesInstances USING (tube_type_instance) group by TubeTypesInstances.tube_type_instance order by count desc limit 40;
tube_type | Antigens | min_date | max_date | count |
---|---|---|---|---|
Myeloid 1 | CD117;CD13;CD15;CD19;CD33;CD34;CD38;CD45;CD71;LA-DR;Unknown | 2005-11-17 14:43:26 | 2013-01-02 09:00:36 | 26862 |
B Cells New | CD10;CD19;CD20;CD38;CD45;CD5;Kappa;Lambda;PE-Texas;Unknown | 2008-10-23 17:08:13 | 2013-01-02 11:19:21 | 25579 |
Myeloid 2 | CD123;CD13;CD14;CD16;CD34;CD38;CD4;CD45;CD64;LA-DR;Unknown | 2006-07-15 11:15:32 | 2013-01-02 09:00:37 | 24020 |
T Cells New | CD2;CD3;CD30;CD34;CD4;CD45;CD5;CD56;CD7;CD8;Unknown | 2007-11-28 16:46:33 | 2013-01-02 11:19:21 | 21070 |
B cells rpt | CD10;CD19;CD20;CD38;CD45;CD5;Kappa;Lambda;Unknown | 2006-02-11 14:40:32 | 2009-08-20 09:22:47 | 18143 |
Myeloid 4 | CD33;CD34;CD38;CD45;CD5;CD56;CD7;PE-Texas;Unknown | 2006-06-14 12:43:37 | 2011-07-15 13:31:13 | 11017 |
T5 | CD2;CD3;CD34;CD4;CD45;CD5;CD56;CD7;CD8;Unknown | 2005-11-17 14:27:54 | 2009-01-24 16:39:45 | 10711 |
Plasma Cell NEW | CD138;CD19;CD38;CD45;CD56;DAPI;PE-Texas;Unknown;cyto | 2008-12-12 13:12:25 | 2012-12-31 15:37:12 | 5157 |
NEWa | CD10;CD19;CD20;CD38;CD45;CD58;PE-Texas;Unknown | 2006-04-19 17:33:19 | 2011-12-31 16:09:56 | 4239 |
COG B | CD10;CD13+33;CD19;CD34;CD45;CD9;PE-Texas;Unknown | 2006-12-12 15:25:16 | 2012-07-06 13:39:38 | 4232 |
Plasma Cells NEW | CD19;CD38;CD45;CD56;DAPI;PE-Texas;Unknown;cyto | 2005-11-18 14:32:51 | 2009-03-03 18:48:58 | 3963 |
WBC | CD34;CD45;CD71;PE-Texas;Unknown | 2005-11-18 11:06:42 | 2008-02-15 11:33:17 | 3643 |
Myeloid 4 | CD33;CD34;CD38;CD45;CD5;CD56;CD7;PE-Texas;Pacific;Unknown | 2011-07-13 15:18:31 | 2012-12-31 16:51:35 | 3047 |
B-ALL | CD10;CD19;CD20;CD34;CD38;CD45;CD58;PE-Texas;Unknown | 2008-11-20 13:15:52 | 2012-07-06 13:40:41 | 2905 |
D | CD19;CD3;CD45;CD71;PE-Texas;SYTO16;Unknown | 2006-12-12 15:28:46 | 2012-07-06 13:40:10 | 2698 |
B ALL MRD | CD10;CD19;CD20;CD34;CD38;CD45;CD58;PE-Texas;Pacific;Unknown | 2011-07-15 12:41:02 | 2013-01-02 08:54:03 | 2119 |
B ALL MRD | CD10;CD19;CD20;CD34;CD38;CD45;CD58;Unknown | 2005-11-19 12:59:18 | 2008-11-20 17:54:51 | 1927 |
T4 | CD16;CD3;CD38;CD45;CD5;CD56;CD7;PE-Texas;Unknown;cCD3 | 2007-12-17 14:40:18 | 2012-12-28 17:49:33 | 1743 |
PNH | CD14;CD33;CD45;CD66b;PE-Texas;Unknown | 2005-11-18 15:28:36 | 2010-07-31 12:38:50 | 1337 |
WBC | CD34;CD45;CD71;DRAQ5;PE-Texas;Unknown | 2005-11-17 14:42:56 | 2010-05-20 16:10:43 | 1245 |
addon | CD10;CD19;CD20;CD38;CD40;CD45;CD5;Kappa;Lambda;Unknown | 2005-11-17 14:27:31 | 2010-02-18 11:46:35 | 1216 |
Myeloid 2 | CD123;CD13;CD14;CD16;CD34;CD38;CD45;CD64;LA-DR;Unknown | 2006-02-09 13:38:49 | 2008-03-15 12:49:19 | 1149 |
D8 | CD10;CD19;CD20;CD34;CD45;PE-Texas;SYTO16;Unknown | 2006-12-12 16:30:13 | 2011-12-24 14:06:28 | 1110 |
Other B cell | CD103;CD11c;CD19;CD25;CD45;PE-Texas;Unknown | 2005-11-17 14:58:40 | 2012-12-23 13:03:02 | 1007 |
Bone Marrow WBC | CD16;CD33;CD38;CD45;CD71;PE-Texas;Unknown | 2010-09-08 14:32:27 | 2012-01-17 13:25:08 | 852 |
T3 | CD3;CD45;CD56;CD7;CD71;PE-Texas;SYTO16;Unknown | 2007-01-26 19:21:53 | 2011-12-31 16:11:21 | 847 |
Hodgkin | CD15;CD20;CD30;CD40;CD45;CD5;CD64;CD71;CD95;Unknown | 2006-10-07 14:10:12 | 2013-01-02 14:46:51 | 665 |
NEW PNH WBC | CD14;CD15;CD24;CD45;CD64;FLAER;PE-Texas;Unknown | 2010-07-29 13:14:44 | 2012-12-30 11:26:40 | 650 |
BAL COUNT | 7AAD;GLY;PE-Texas;Pacific;SYTO;Unknown | 2011-10-28 17:03:33 | 2012-12-31 16:49:15 | 625 |
CLL TUBE | CD19;CD200;CD23;CD5;FMC7;Pacific;Unknown | 2011-07-13 17:13:52 | 2012-12-28 11:54:21 | 542 |
Myeloid 2 | CD123;CD13;CD14;CD16;CD34;CD36;CD38;CD45;CD64;LA-DR;Unknown | 2005-11-17 14:43:46 | 2006-02-15 11:16:04 | 475 |
new TdT | CD10;CD38;CD45;CD7;PE-Texas;TdT;Unknown;cCD3 | 2007-06-19 15:08:15 | 2008-06-06 13:47:51 | 444 |
NEW PNH RBC | CD59;GlyA;PE-Texas;Pacific;Unknown | 2011-07-14 11:10:46 | 2012-12-30 11:26:38 | 343 |
New MASTO | CD117;CD2;CD25;CD45;PE-Texas;Pacific;Unknown | 2012-01-17 15:41:09 | 2012-12-28 09:07:40 | 343 |
Neg MASTO | CD117;CD45;PE-Texas;Pacific;Unknown | 2012-01-17 15:41:46 | 2012-12-28 09:07:41 | 340 |
Bcl-2 addon | Bcl2;CD10;CD19;CD20;CD38;CD45;CD5;PE-Texas;Unknown | 2006-06-26 16:27:24 | 2011-07-16 12:02:12 | 317 |
BAL COUNT | 7AAD;GLY;PE-Texas;SYTO;Unknown | 2010-02-02 16:50:06 | 2011-07-11 14:30:13 | 316 |
T5 New | CD16+56;CD2;CD3;CD34;CD4;CD45;CD5;CD7;CD8;Unknown | 2009-01-31 14:50:29 | 2012-05-09 16:31:13 | 313 |
CLL TUBE | CD19;CD200;CD23;CD5;FMC7;Unknown | 2010-12-28 15:40:16 | 2011-07-15 10:30:45 | 291 |
NEW PNH RBC | CD59;GlyA;PE-Texas;Unknown | 2010-05-06 16:42:24 | 2011-07-14 14:54:47 | 284 |
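One way to flag the affected rows automatically is to look for fluorophore name fragments inside the semicolon-joined Antigens string. The following is a minimal sketch, not code from this repository: the fragment list (`PE-Texas`, `Pacific`) is an assumption taken from the rows above, and the function name is hypothetical.

```python
# Pieces of fluorophore names that should never appear as antigens.
# ASSUMPTION: this list is read off the table above; extend it as needed.
FLUOROPHORE_FRAGMENTS = {"PE-Texas", "Pacific"}


def misparsed_fragments(antigens):
    """Return the fluorophore fragments found in a ';'-joined antigen string."""
    return sorted(set(antigens.split(";")) & FLUOROPHORE_FRAGMENTS)


# A few rows copied from the query output above: (tube_type, Antigens).
rows = [
    ("Myeloid 1", "CD117;CD13;CD15;CD19;CD33;CD34;CD38;CD45;CD71;LA-DR;Unknown"),
    ("B Cells New", "CD10;CD19;CD20;CD38;CD45;CD5;Kappa;Lambda;PE-Texas;Unknown"),
    ("B ALL MRD", "CD10;CD19;CD20;CD34;CD38;CD45;CD58;PE-Texas;Pacific;Unknown"),
]

for tube_type, antigens in rows:
    hits = misparsed_fragments(antigens)
    if hits:
        print(f"{tube_type}: {', '.join(hits)}")
```

Rows with an empty result, such as Myeloid 1 here, contain none of the listed fragments; that does not by itself prove they were parsed correctly.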
Okay. Let's decide what to clean up and what to document and punt till later.
-Dan
I'm not sure what you mean by "what to clean up and what to document"?
Input:
Database
Look at data after 2006.
Antigens to fix: select tube_type, Antigen, COUNT(*) as count from PmtTubeCases INNER JOIN TubeCases USING (case_tube) INNER JOIN TubeTypesInstances USING (tube_type_instance) WHERE tube_type LIKE 'Myeloid%' GROUP BY tube_type, Antigen ORDER BY count desc limit 50;
Fluorophores to fix: sqlite> select tube_type, fluorophore, COUNT(*) as count from PmtTubeCases INNER JOIN TubeCases USING (case_tube) INNER JOIN TubeTypesInstances USING (tube_type_instance) WHERE tube_type LIKE 'Myeloid%' GROUP BY tube_type, fluorophore ORDER BY count desc limit 50;
Addressed, but we need to check this in the new data.
Many fluorophores appear in the concatenated Antigens field for each tube_type, which suggests that parsing is failing for many antigen/fluorophore pairs.
We need to make sure it is working for the Myeloid tubes. If it is, handle the rest later.
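The `PE-Texas` and `Pacific` entries are consistent with a label such as `CD19 PE-Texas Red` being split on whitespace, although that cause is an assumption and has not been confirmed against the parser. Under that assumption, a sketch of a more robust split matches the longest known fluorophore name at the end of the label. The label format `<antigen> <fluorophore>`, the fluorophore list, and the function name are all hypothetical.

```python
# ASSUMPTION: labels look like "<antigen> <fluorophore>"; this list is
# illustrative and is not the set of fluorophores used in the database.
KNOWN_FLUOROPHORES = ["PE-Texas Red", "Pacific Blue", "FITC", "APC", "PE"]


def split_label(label):
    """Split a label into (antigen, fluorophore).

    The longest known fluorophore name that ends the label wins, so
    multi-word names such as "PE-Texas Red" stay intact. Either element
    is None when it cannot be identified.
    """
    text = " ".join(label.split())  # collapse runs of whitespace
    lowered = text.lower()
    for fluor in sorted(KNOWN_FLUOROPHORES, key=len, reverse=True):
        if lowered == fluor.lower():
            return None, fluor  # fluorophore only, no antigen
        if lowered.endswith(" " + fluor.lower()):
            return text[: -len(fluor)].strip(), fluor
    return text, None  # no known fluorophore found


print(split_label("CD19 PE-Texas Red"))
print(split_label("CD45  Pacific Blue"))
print(split_label("CD5"))
```

Checking longer names first matters: `PE` would otherwise never match a label ending in `PE-Texas Red`, but a shorter name could shadow a longer one that shares its ending.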
Are these appropriate:
Antigen issues: A|684 Neg|81 neg|74 NEG|895
Fluorophore: A PE|1758
Neg means negative control: no antibody and no fluorophore. I can add some code to make them all lower case. I think they should be kept as an "antigen", as these tubes are typically run paired with a tube in which you are trying to measure a small level of expression.
"A" and "A PE" sound like a parsing issue; can you tell me which tube they come from, or which other antigens are in that tube?
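As a rough illustration of the lower-casing idea, the sketch below folds the `Neg`, `neg`, and `NEG` variants into a single `neg` label and leaves every other antigen name untouched. The function name is hypothetical, and the counts are the ones quoted in this thread.

```python
from collections import Counter


def normalize_antigen(name):
    """Map any capitalization of the negative-control label to 'neg'."""
    cleaned = name.strip()
    if cleaned.lower() == "neg":
        return "neg"
    return cleaned  # leave real antigen names (e.g. "CD45") unchanged


# Antigen counts as reported above.
raw_counts = {"A": 684, "Neg": 81, "neg": 74, "NEG": 895}

merged = Counter()
for antigen, count in raw_counts.items():
    merged[normalize_antigen(antigen)] += count

print(dict(merged))
```

Only the negative-control label is lower-cased here, so case-sensitive antigen names such as `cCD3` or `GlyA` are not altered.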
It seems that our parsing method doesn't do a good job of splitting antigens and fluorophores. Things seem to fall apart with a long tail of research and custom samples. (Kappa and Lambda look odd; I don't know what is going on there.)
select fluorophore,count(*) as count from pmttubecases group by fluorophore order by count desc limit 50;