AllenInstitute / cell_type_mapper

Repository for storing prototype functionality implementations for the BKP

cell_type_mapper.cli.query_markers can't run if precomputed_stats isn't in cached directory #20

Closed scseeman closed 1 month ago

scseeman commented 1 month ago

I have precomputed_stats.h5 and reference_markers.h5 files that were shared with me. While trying to generate the query_markers.json needed to run mapping, I ran into a problem: cell_type_mapper.cli.query_markers expects the precomputed_stats file to be at the path recorded in the reference_markers.h5 file, which is wherever it was when the reference markers were generated. What are my options here? From what I can tell, cell_type_mapper.cli.from_specified_markers can't use the reference_markers.h5 file, so I have to go through this step.

danielsf commented 1 month ago

For anyone else who finds this thread:

The context for this issue is that the query marker finder identifies the relevant precomputed_stats file by reading it from the metadata stored in the reference_marker file. This is done to prevent users from accidentally specifying reference_marker and precomputed_stats files that are not consistent. The failure identified in this issue arose when the file path stored for the precomputed_stats file in the reference_marker file was no longer valid due to the files being shared between users.

We have solved this problem by adding the command line argument --search_for_stats_file to the

python -m cell_type_mapper.cli.query_markers

tool. If run with --search_for_stats_file True, the query marker finder will first check the location specified in the metadata of the reference_marker file; if the precomputed_stats file is not there, it will look for it in the same directory as the reference_marker file. The name of the precomputed_stats file is still read from the metadata of the reference_marker file, but its absolute path can change, so long as the reference_marker file and precomputed_stats file are in the same directory.
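
For readers who want to see the fallback spelled out, here is a minimal sketch of the lookup order described above. It is not the actual cell_type_mapper implementation: the function name resolve_stats_path and its arguments are hypothetical, and it assumes the stored path has already been read out of the reference_marker file's metadata.

```python
from pathlib import Path


def resolve_stats_path(stored_stats_path, reference_marker_path,
                       search_for_stats_file=False):
    """Hypothetical helper illustrating the fallback; not the library's API.

    stored_stats_path: the precomputed_stats path recorded in the
        reference_marker file's metadata (assumed already extracted).
    reference_marker_path: where the reference_marker file lives now.
    """
    stored = Path(stored_stats_path)

    # First choice: the exact location recorded in the metadata.
    if stored.is_file():
        return stored

    # Fallback: same file name, but next to the reference_marker file.
    if search_for_stats_file:
        candidate = Path(reference_marker_path).parent / stored.name
        if candidate.is_file():
            return candidate

    raise FileNotFoundError(
        f"Could not find precomputed_stats file '{stored.name}' at "
        f"'{stored}' or alongside '{reference_marker_path}'"
    )
```

Only the directory is allowed to differ; the file name still has to match what the metadata records, so a renamed precomputed_stats file would not be found by this sketch.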