Closed Matthias3033 closed 4 years ago
Hi @Matthias3033 ,
Can you list the databases you are using here? From the error, it sounds like no genes in the database overlap with your data.
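One quick way to test that hypothesis is to intersect the gene names from the expression matrix with the gene names from a ranking database. The sketch below uses plain Python collections only: `gene_overlap` is a hypothetical helper, and the gene lists are made up for illustration. With real data you would pass in the gene names from your own expression matrix and database; how you extract them is an assumption left out here.

```python
def gene_overlap(matrix_genes, database_genes):
    """Return (shared genes, fraction of matrix genes found in the database)."""
    matrix_set = set(matrix_genes)
    shared = matrix_set & set(database_genes)
    # Guard against an empty matrix to avoid dividing by zero.
    fraction = len(shared) / len(matrix_set) if matrix_set else 0.0
    return shared, fraction


# Made-up example: the symbols differ only in case, so nothing matches.
matrix_genes = ["sox2", "pax6", "gata1"]
database_genes = ["SOX2", "PAX6", "GATA1", "TP53"]

shared, fraction = gene_overlap(matrix_genes, database_genes)
print(f"{len(shared)} shared genes ({fraction:.0%} of the matrix)")
```

An overlap close to zero would point to a naming mismatch (for example a different species, identifier type, or letter case) rather than a memory problem.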
Hi @cflerin,
These are the databases that I use: FeatherRankingDatabase(name="hg19-tss-centered-10kb-10species.mc9nr"), FeatherRankingDatabase(name="hg19-tss-centered-10kb-7species.mc9nr"), FeatherRankingDatabase(name="hg19-tss-centered-5kb-10species.mc9nr"), FeatherRankingDatabase(name="hg19-500bp-upstream-7species.mc9nr"), FeatherRankingDatabase(name="hg19-tss-centered-5kb-7species.mc9nr"), FeatherRankingDatabase(name="hg19-500bp-upstream-10species.mc9nr")
The databases look fine (there's no need to use the 7-species databases when you are also using the 10-species ones, but it won't cause issues). Are you also using the correct motif annotations file (for human)? How many genes are in your expression matrix? And how many modules do you have?
I am using the correct motif file. The number of genes is 17098. How do I get the number of modules? (with len(modules) I get 4996)
Just noticed:
186 except MemoryError:
187 LOGGER.error("Unable to process "{}" on database "{}" because ran out of memory.
which seems self-explanatory. You could try taking out the three 7-species databases and see if it works with the remaining databases.
Same error. I've also tried it with only one 7 species database - still the same error.
How much memory do you have available on your machine? You could try reducing the number of processes that pyscenic is using...
How can I reduce the number of processes?
Via the CLI, use the parameter --num_workers N, where N specifies the number of cores to use. When using the API in Jupyter notebooks, a similar parameter is available.
For the prune2df function (cistarget step), the parameter name is num_workers. For grnboost, I kindly refer you to the arboreto package documentation: https://github.com/tmoerman/arboreto. Briefly, you need to use a construct like this:
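from pyscenic.prune import _prepare_client
from arboreto.algo import grnboost2
# Create a Dask client limited to the requested number of workers.
client, shutdown_callback = _prepare_client('local_host', num_workers=12)
# Pass the client to grnboost2 so the inference runs on those workers.
network = grnboost2(expression_data=ex_mtx, tf_names=tf_names, verbose=True, client_or_address=client)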
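As a rough aid for picking a value for num_workers, the sketch below caps the worker count by both the core count and a memory budget. `workers_for_memory` is a hypothetical helper, not part of pySCENIC or arboreto, and the per-worker memory figure is an assumption you would have to measure on your own data.

```python
import os


def workers_for_memory(available_gb, per_worker_gb, max_cores=None):
    """Largest worker count whose combined memory stays within available_gb."""
    if per_worker_gb <= 0:
        raise ValueError("per_worker_gb must be positive")
    # Fall back to the machine's core count when no limit is given.
    cores = max_cores if max_cores is not None else (os.cpu_count() or 1)
    by_memory = int(available_gb // per_worker_gb)
    # Always allow at least one worker, even if the budget is too small.
    return max(1, min(cores, by_memory))


# Hypothetical numbers: 120 GB of RAM, about 20 GB per worker, 12 cores.
num_workers = workers_for_memory(available_gb=120, per_worker_gb=20, max_cores=12)
print(num_workers)
```

The result could then be passed as num_workers; if the job still runs out of memory, the per-worker estimate was probably too low and the count should be reduced further.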
How much memory do you have available on your machine? You could try reducing the number of processes that pyscenic is using...
In theory I have 120 GB of RAM, so memory should not normally be a problem.
Hi,
I get the following error message when I use the function prune2df:
AssertionError Traceback (most recent call last)