Closed stevekm closed 3 years ago
there are some log and job files here to use for debugging this:
/juno/work/ci/helix_filters_01/test_data/11089_G
this is fixed for the moment, using the reduced Facets python dict method. If it becomes a problem again, try some of these other ideas
also note that work to test mem usage was done here /juno/work/ci/kellys5/projects/benchmarking
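For reference, a minimal sketch of what a reduced-dict merge could look like. This is not the code in update_cBioPortal_data.py; the function names, the key columns, and the `clonality` column are assumptions chosen for illustration. The idea is to keep only the key and the needed Facets values in memory, then stream the mutations file one row at a time. Note that this sketch drops `#` header comments from the output.

```python
import csv

# Assumed key columns; the real script may match rows on a different set.
KEY_COLS = ("Tumor_Sample_Barcode", "Chromosome", "Start_Position",
            "Reference_Allele", "Tumor_Seq_Allele2")
# Hypothetical Facets column(s) to carry over into the mutations file.
FACETS_COLS = ("clonality",)


def skip_comments(lines):
    """Yield only the lines that are not '#' header comments."""
    for line in lines:
        if not line.startswith("#"):
            yield line


def load_reduced_lookup(facets_lines, key_cols=KEY_COLS, keep_cols=FACETS_COLS):
    """Build {key tuple: values tuple}, discarding every other Facets column."""
    reader = csv.DictReader(skip_comments(facets_lines), delimiter="\t")
    lookup = {}
    for row in reader:
        key = tuple(row[c] for c in key_cols)
        lookup[key] = tuple(row[c] for c in keep_cols)
    return lookup


def merge_stream(mutations_lines, lookup, out, key_cols=KEY_COLS,
                 keep_cols=FACETS_COLS, missing="NA"):
    """Stream the mutations rows to `out`, appending the Facets values."""
    reader = csv.DictReader(skip_comments(mutations_lines), delimiter="\t")
    fieldnames = list(reader.fieldnames) + list(keep_cols)
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        key = tuple(row[c] for c in key_cols)
        values = lookup.get(key, (missing,) * len(keep_cols))
        row.update(zip(keep_cols, values))
        writer.writerow(row)
```

Both functions accept any iterable of lines, so open file handles can be passed directly and only the reduced lookup is held in memory.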
The script update_cBioPortal_data.py is using large amounts of memory when merging Facets maf information into the data_mutations_extended.txt file.
Some ideas:
Use the join command, as shown here: https://unix.stackexchange.com/questions/113898/how-to-merge-two-files-based-on-the-matching-of-two-columns . This will require pre-creating a single 'key' column, and both files will need to be sorted on that column. It is also not clear how header comments would be handled.
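A rough sketch of the preparation step the join idea would need, assuming tab-separated input with `#` comment lines above a single header row. The function name `write_keyed`, the `|` key separator, and the column names in the usage note are hypothetical. It prepends one composite key column to every data row and returns the comments and header separately, so that sorting cannot move them into the data.

```python
def write_keyed(lines, out, key_cols, sep="\t", key_sep="|"):
    """Write each data row to `out` prefixed with a single composite key column.

    Returns (comments, header) so the caller can re-attach them after the
    keyed rows have been sorted and joined.
    """
    comments = []
    header = None
    indexes = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith("#"):
            comments.append(line)  # set header comments aside
            continue
        fields = line.split(sep)
        if header is None:
            # The first non-comment line is the header; find the key columns.
            header = fields
            indexes = [header.index(c) for c in key_cols]
            continue
        key = key_sep.join(fields[i] for i in indexes)
        out.write(key + sep + line + "\n")
    return comments, header
```

For example, `write_keyed(open("data_mutations_extended.txt"), open("muts.keyed.tsv", "w"), ["Tumor_Sample_Barcode", "Chromosome", "Start_Position"])` would produce a headerless file whose first column is the key. Both keyed files would then have to be sorted on that first column with the same collation (for example under `LC_ALL=C`) before being passed to join.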