Closed by hanslovsky 11 months ago
Hi @hanslovsky, you can treat JCP2022_028373 as JCP2022_UNKNOWN. JCP2022_028373 should have been removed from the wells.csv.gz. We will fix that. Thanks for catching that!
Thank you @niranjchandrasekaran! Should JCP2022_028373 be removed completely in wells.csv.gz, or replaced by JCP2022_UNKNOWN?
I can make a PR with the change if that is helpful for you.
Hi @hanslovsky, thanks for offering to do that! We generate these metadata files using a couple of internal scripts, so we want to re-run them to keep everything reproducible. I will keep you posted. For the time being, you can ignore all wells with JCP2022_028373.
That makes sense. I will mark JCP2022_028373 unknown on my side.
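For anyone applying the same workaround, here is a minimal sketch of marking the ID as unknown on the consumer side. The helper name `mark_unknown` and the column name `Metadata_JCP2022` are assumptions made for illustration; adjust them to whatever your local copy of the well table uses.

```python
UNKNOWN_ID = "JCP2022_UNKNOWN"


def mark_unknown(rows, bad_ids, column="Metadata_JCP2022"):
    """Yield rows (dicts) with any ID listed in `bad_ids` replaced by UNKNOWN_ID.

    `column` is an assumed column name; the input rows are not modified.
    """
    for row in rows:
        if row[column] in bad_ids:
            # Copy the row so the caller's data stays untouched.
            row = {**row, column: UNKNOWN_ID}
        yield row
```

The rows can come from `csv.DictReader` or any other source that yields dictionaries, for example `list(mark_unknown(rows, {"JCP2022_028373"}))`.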
Hi @hanslovsky, wells.csv.gz should now be fixed. Thanks for bringing this to our attention!
I am currently downloading metadata and images following #72, then matching the metadata from load_data_with_illum.parquet with the contents of well.csv.gz to get the JCP2022 id, and then double-checking that the JCP2022 id is valid via compound.csv.gz, crispr.csv.gz, and orf.csv.gz. I found that JCP2022_028373 is in well.csv.gz but not in any of compound.csv.gz, crispr.csv.gz, or orf.csv.gz. This is how you can reproduce it:

As far as I can tell, all other JCPs in well.csv.gz have a corresponding entry in one of the JCP files. I will treat this like JCP2022_UNKNOWN. Is there any chance that this missing JCP will be added to the metadata in the future?
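The consistency check described above can be sketched roughly as follows. This is not the original reproduction snippet, only an illustration using the Python standard library. The function names are hypothetical, and the column name `Metadata_JCP2022` is an assumption about the header of the metadata files.

```python
import csv
import gzip


def read_jcp_ids(path, column="Metadata_JCP2022"):
    """Return the set of values found in `column` of a gzipped CSV file."""
    with gzip.open(path, mode="rt", newline="") as handle:
        return {row[column] for row in csv.DictReader(handle)}


def find_orphan_jcp_ids(well_path, reference_paths, column="Metadata_JCP2022"):
    """Return IDs that occur in the well table but in none of the reference tables."""
    known = set()
    for path in reference_paths:
        known |= read_jcp_ids(path, column)
    return read_jcp_ids(well_path, column) - known
```

Calling `find_orphan_jcp_ids("well.csv.gz", ["compound.csv.gz", "crispr.csv.gz", "orf.csv.gz"])` on local copies of the files returns the set of IDs without a matching entry. An empty set means every ID in the well table is accounted for.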