PUREE runs unresponsive

suhuanhou commented 7 months ago

R: write.csv(df_bulk, file = "df_bulk.csv", quote = FALSE, row.names = TRUE)

python:
p = PUREE()
purities_and_logs = p.get_output(dir_dataset.joinpath("df_bulk.csv"), 'HGNC')

PUREE runs but is unresponsive! No information is output!

I uploaded the df_bulk.csv file to the server and soon had the result. So what's wrong with the API?

Also, can the returned result use the original sample name?

erevkov commented 7 months ago

Hi, thank you for trying PUREE!

Unfortunately, I am not sure I understand the question. Could you please clarify it? The output looks correct: the purity is predicted for each sample, and the range looks appropriate. The method is not expected to print any explicit statement; you can also check the logs for runtime messages.

For security reasons, the online method will only return anonymized sample names, but the samples are returned in the same order as in the input, so you could easily rename the resulting series with the original index. However, if you would prefer to run the method locally, you could obtain the internal source code of the method for academic use - please contact the corresponding author of the original paper for that.
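As a rough illustration of the renaming step, the sketch below reads the original sample names from the input CSV and pairs them, in order, with the returned purities. It assumes samples are rows and the first column holds the sample names (as with write.csv and row.names = TRUE), and that the purities are available as a plain sequence of values; the helper names original_sample_names and relabel are hypothetical and not part of PUREE.

```python
import csv
from typing import Dict, Iterable, List, Sequence


def original_sample_names(rows: Iterable[List[str]]) -> List[str]:
    """Return the first column of each data row, skipping the header row."""
    it = iter(rows)
    next(it, None)  # skip the header row (gene names)
    return [row[0] for row in it if row]


def relabel(purities: Sequence[float], names: Sequence[str]) -> Dict[str, float]:
    """Pair purities with sample names, relying on identical ordering."""
    if len(purities) != len(names):
        raise ValueError("number of purities does not match number of samples")
    return dict(zip(names, purities))


# In-memory stand-in for df_bulk.csv (assumed layout: samples as rows).
demo_lines = [",TP53,EGFR", "sampleA,1.0,2.0", "sampleB,3.0,4.0"]
names = original_sample_names(csv.reader(demo_lines))

# Made-up purity values standing in for the anonymized output, in input order.
labelled = relabel([0.71, 0.42], names)
print(labelled)
```

With a real file, the same reader can be fed from open("df_bulk.csv", newline="") instead of the in-memory lines.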

suhuanhou commented 7 months ago

Thank you for your answers!

I tried the API method and it ran without an error, but it produced no result either. I then analyzed the same data online and got the results.

I want to be able to run PUREE through the API.

erevkov commented 7 months ago

Could you please clarify further what kind of behavior you observe? The get_output method from the API should return the purities and the logs in a dictionary. For example, after running

p = PUREE()
purities_and_logs = p.get_output(expression_matrix_path, 'HGNC')

the resulting purities would be stored in purities_and_logs['output']. Could you please let us know what purities_and_logs['output'] contains in your case?

suhuanhou commented 7 months ago

Unfortunately, nothing was generated, not even log files. Only an empty tmp_dir directory was created.

tanmay2893 commented 7 months ago

Hi,

The most likely cause is a mismatch between the gene identifier type you specified and the identifiers actually used in your data, which typically produces this error: 'ERROR: All of the selected genes are missing from the data, exiting... (Did you set the correct gene identifier type (HGNC or ENSEMBL)?)'

To assist in resolving the issue, we kindly ask you to check the following:

1. Please confirm that the gene identifier type you passed (HGNC or ENSEMBL) matches the nomenclature used in your dataset.
2. Please inspect the purities_and_logs variable in your environment. After a successful run, it should contain two keys: output and logs. After an unsuccessful run, it should be a tuple of two elements, such as (False, 'ERROR: ...').
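The two result shapes described above can be told apart with a small check. This is only a sketch based on the description in this thread: describe_result is a hypothetical helper, not part of PUREE, and the sample values passed to it are made up.

```python
def describe_result(result):
    """Summarize a get_output return value, per the shapes described in this thread."""
    # Successful run: a dictionary holding both 'output' and 'logs'.
    if isinstance(result, dict) and "output" in result and "logs" in result:
        return "success"
    # Unsuccessful run: a two-element tuple such as (False, 'ERROR: ...').
    if isinstance(result, tuple) and len(result) == 2 and result[0] is False:
        return f"failed: {result[1]}"
    # Anything else is worth reporting as-is.
    return f"unexpected result of type {type(result).__name__}"


# Made-up stand-ins for the two documented shapes.
print(describe_result({"output": [0.5], "logs": ""}))
print(describe_result((False, "ERROR: example message")))
```

Calling describe_result(purities_and_logs) right after get_output returns would show which case applies even when nothing is printed or written to disk.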

Should you encounter any difficulties or require further clarification, do not hesitate to reach out.

suhuanhou commented 7 months ago

Thank you for your careful answers! But please note what I've described.

No files were produced, no error messages, no log files, nothing.

When I analyzed the same data on the web, it worked, proving that there was no problem with the data format and the gene name.

tanmay2893 commented 7 months ago

Thank you for providing the output of purities_and_logs. We see that according to the returned tuple, the function fails due to the PosixPath object supplied as an input for the file path, whereas the .get_output method expects a string path input. Could you please try using a string path format such as '/home/user/...' as the input for the file path? Please don't hesitate to reach out if further assistance is needed.
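A minimal sketch of that fix, assuming the path is built with pathlib as in the original report: wrapping the Path object in str() yields a plain string. The directory name below is a placeholder, and the PUREE calls are left commented out so the snippet runs without the package installed.

```python
from pathlib import Path

# Placeholder dataset directory; substitute your own location.
dir_dataset = Path("data")

# joinpath returns a Path object (PosixPath on Linux/macOS), not a string.
csv_path_obj = dir_dataset.joinpath("df_bulk.csv")

# str() converts it to the plain string path that get_output is said to expect.
csv_path = str(csv_path_obj)
print(type(csv_path).__name__, csv_path)

# The call from this thread would then become:
# p = PUREE()
# purities_and_logs = p.get_output(csv_path, 'HGNC')
```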

tanmay2893 commented 3 months ago

We have not received any updates or further information on this issue for some time now. To keep our issue tracker focused and up-to-date, we will be closing this issue. If you believe this issue is still relevant or if you have additional details to provide, please feel free to reopen the issue.

skandlab / PUREE

PUREE runs unresponsive #10