Closed shntnu closed 2 years ago
The older version of cytominer-database saves a nan as the string "nan", whereas the latest version saves it as NULL, which is the desired behavior.
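To illustrate why the difference matters, here is a minimal sketch using Python's built-in sqlite3 module. It does not use cytominer-database itself; the in-memory table and its `value` column are made up for illustration, and the text "nan" merely imitates what the older version is described as writing.

```python
import sqlite3

# Hypothetical in-memory table; not the real cytominer-database schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cells (value REAL)")

# "nan" as text imitates the older behavior; None is stored as NULL.
conn.executemany("INSERT INTO Cells VALUES (?)", [(1.5,), ("nan",), (None,)])

# typeof() shows how each value was actually stored.
rows = conn.execute(
    "SELECT typeof(value), COUNT(*) FROM Cells GROUP BY typeof(value) ORDER BY 1"
).fetchall()
print(rows)  # [('null', 1), ('real', 1), ('text', 1)]

# Only the real NULL is found by IS NULL; the "nan" string is missed.
null_count = conn.execute("SELECT COUNT(*) FROM Cells WHERE value IS NULL").fetchone()[0]
print(null_count)  # 1
```

A "nan" stored as text is invisible to `IS NULL` and to tools that map NULL to NA, which is presumably why NULL is the desired behavior.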
Fixture created using cytominer-database 0.3.3: https://imaging-platform.s3.us-east-1.amazonaws.com/projects/2018_06_05_cmQTL/workspace/backend/2020_03_05_Batch6/cmQTLplate7-2-27-20/cmQTLplate7-2-27-20.sqlite
We know that Cells_Neighbors_AngleBetweenNeighbors_10 has some NAs
Export that column
sqlite3 ~/ebs_tmp/2018_06_05_cmQTL/workspace/backend/2020_03_05_Batch6/cmQTLplate7-2-27-20/cmQTLplate7-2-27-20.sqlite
.headers on
.mode csv
.output cmQTLplate7-2-27-20_Cells_Neighbors_AngleBetweenNeighbors_10.csv
select Cells_Neighbors_AngleBetweenNeighbors_10 from Cells;
.exit
Count the empty lines (the NAs are exported as empty fields):
grep -v "\\." cmQTLplate7-2-27-20_Cells_Neighbors_AngleBetweenNeighbors_10.csv |grep -v Cells_Neighbors_AngleBetweenNeighbors_10|wc -l
> 198
Check for any nan strings:
grep nan cmQTLplate7-2-27-20_Cells_Neighbors_AngleBetweenNeighbors_10.csv | wc -l
> 0
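The two grep checks above can also be done in one pass. The following is a rough Python sketch, assuming a single-column CSV with a header row in which missing values appear as empty lines, as in the export above; the function name `count_empty_and_nan` is hypothetical.

```python
import csv
import io


def count_empty_and_nan(csv_file):
    """Return (empty_count, nan_count) for a single-column CSV with a header."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    empty = nan = 0
    for row in reader:
        # csv.reader yields an empty list for a blank line.
        value = row[0].strip() if row else ""
        if value == "":
            empty += 1
        elif value.lower() == "nan":
            nan += 1
    return empty, nan


# Small made-up sample: two values and two empty lines.
sample = io.StringIO("Cells_Neighbors_AngleBetweenNeighbors_10\n12.5\n\n90.0\n\n")
print(count_empty_and_nan(sample))  # (2, 0)
```

To run it on the exported file, open it with `open(path, newline="")` and pass the file object to the function.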
Cross-check the NA count in R:
library(magrittr)
sqlite_file <- "~/ebs_tmp/2018_06_05_cmQTL/workspace/backend/2020_03_05_Batch6/cmQTLplate7-2-27-20/cmQTLplate7-2-27-20.sqlite"
db <- dplyr::src_sqlite(path = sqlite_file)
cells <- dplyr::tbl(src = db, "cells")
feature <- cells %>% dplyr::select(Cells_Neighbors_AngleBetweenNeighbors_10) %>% dplyr::collect()
sum(is.na(feature))
[1] 198
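The same check can be run directly against the database, counting NULLs and "nan" strings in a single query. This is a sketch using Python's built-in sqlite3 module rather than anything from cytominer-database; the helper name `count_null_and_nan_strings` is hypothetical, and the demo table is made up.

```python
import sqlite3


def count_null_and_nan_strings(conn, table, column):
    """Return (null_count, nan_string_count) for one column of a table."""
    # Identifiers cannot be bound as parameters, so quote them manually.
    q_table = '"' + table.replace('"', '""') + '"'
    q_column = '"' + column.replace('"', '""') + '"'
    query = (
        f"SELECT COALESCE(SUM({q_column} IS NULL), 0), "
        f"COALESCE(SUM(typeof({q_column}) = 'text' AND lower({q_column}) = 'nan'), 0) "
        f"FROM {q_table}"
    )
    return conn.execute(query).fetchone()


# Made-up demo data: two NULLs and one "nan" string.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Cells (Cells_Neighbors_AngleBetweenNeighbors_10 REAL)")
demo.executemany("INSERT INTO Cells VALUES (?)", [(10.0,), (None,), (None,), ("nan",)])
print(count_null_and_nan_strings(demo, "Cells", "Cells_Neighbors_AngleBetweenNeighbors_10"))  # (2, 1)
```

For the real file, pass `sqlite3.connect(path_to_sqlite_file)` as `conn`. A correctly written backend should report zero "nan" strings.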
So everything lines up here: 198 empty values in the CSV export, 198 NAs in R, and no nan strings.
The VM mentioned in this manual currently has an older version of cytominer-database. Install the latest version, primarily to handle nans correctly:
pip install --upgrade cytominer-database
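After upgrading, it may help to confirm which version is actually installed in the active environment. A small sketch using the standard library's importlib.metadata (Python 3.8+); the helper name `installed_version` is hypothetical, and it assumes the distribution name is the same one used with pip above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if the package is not installed here.
print(installed_version("cytominer-database"))
```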
See this PR and comment https://github.com/cytomining/cytominer-database/pull/104#issuecomment-511440383
cc @NasimJ @DavidStirling @bethac07