biomap-research / scFoundation

Apache License 2.0

Bad Request, code: 400 #17

Closed: lihan97 closed this issue 3 months ago

lihan97 commented 5 months ago

When executing example.sh in the ./apiexample directory, the following error consistently occurs:

begin request
Namespace(data_path='./data/baron_human_samp_19264_fromsaver_demo.csv', input_type='singlecell', output_type='cell', pool_type='all', pre_normalized='F', save_path='./demo/Baron_demo/a5/', tgthighres='a5', token='biomapxxxxxxxxxxxxx', url='https://api.biomap.com/inference/xtrimogene', version='0.2')
https://api.biomap.com/inference/xtrimogene
{'Accept-Encoding': 'gzip,deflate,br', 'B-Authorization': 'biomapxxxxxxxxxxxx'}
code:400
text:{"error":"Failed to process the request(s) for model instance 'xtrimogene_0', message: TypeError: init(): incompatible constructor arguments. The following argument types are supported:\n 1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)\n\nInvoked with: kwargs: error=<c_python_backend_utils.TritonError object at 0x7f9049781cf0>\n\nAt:\n /work/xtrimogene/service/xtrimogene/1/model.py(145): execute\n"}
reason:Bad Request

lihan97 commented 5 months ago

The script ran properly earlier today, but shortly afterwards it raised the following error:

begin request
Namespace(data_path='./data/gene_batch.npy', input_type='singlecell', output_type='gene', pool_type='all', pre_normalized='A', save_path='./demo/GEARS_demo_batch/f1/', tgthighres='f1', token='biomapxxxxxxxxxx', url='https://api.biomap.com/inference/xtrimogene', version='0.1')
https://api.biomap.com/inference/xtrimogene
{'Accept-Encoding': 'gzip,deflate,br', 'B-Authorization': 'biomapxxxxxxxxxxxxxxx'}
code:400
text:{"error":"Failed to process the request(s) for model instance 'xtrimogene_0', message: RuntimeError: CUDA error: device-side assert triggered\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\n\nAt:\n /opt/conda/lib/python3.7/site-packages/torch/cuda/memory.py(114): empty_cache\n /work/xtrimogene/service/xtrimogene/1/model.py(73): execute\n"}
reason:Bad Request

WhirlFirst commented 5 months ago

Hi lihan97, could you please give more information about what kind of data you used? Did you align the gene names with the expected gene list and remove all abnormal gene expression values from the input?
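The two checks asked about here can be sketched roughly as follows. This is only an illustration, not the repository's own preprocessing: the helper names `align_to_reference` and `abnormal_rows` are hypothetical, the reference gene list is assumed to be supplied by the caller, and "abnormal" is assumed to mean NaN, infinite, or negative values.

```python
import numpy as np


def align_to_reference(matrix, genes, reference_genes):
    """Reorder the columns of `matrix` to follow `reference_genes`.

    Genes missing from the input are filled with 0 (an assumption).
    If a gene name is duplicated in `genes`, the last column wins.
    """
    matrix = np.asarray(matrix, dtype=float)
    column_of = {gene: i for i, gene in enumerate(genes)}
    aligned = np.zeros((matrix.shape[0], len(reference_genes)), dtype=float)
    for j, gene in enumerate(reference_genes):
        i = column_of.get(gene)
        if i is not None:
            aligned[:, j] = matrix[:, i]
    return aligned


def abnormal_rows(matrix):
    """Return indices of rows containing NaN, infinite, or negative values."""
    matrix = np.asarray(matrix, dtype=float)
    bad = ~np.isfinite(matrix) | (matrix < 0)
    return np.flatnonzero(bad.any(axis=1))
```

Rows reported by `abnormal_rows` could then be dropped or repaired before the request is sent, depending on what the data calls for.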

lihan97 commented 5 months ago

I checked the input data format and ensured its consistency with the provided example data, successfully resolving the issue. Thank you for your suggestion!

I have an additional question: I want to obtain cell/gene embeddings for some downstream tasks. Does the selection of F/T/A for "--pre_normalized" influence the resulting embeddings? If so, which option is recommended?

WhirlFirst commented 5 months ago

Very happy to hear that your inference succeeded!

Yes, this argument is determined by the input data format. For more details, please refer to the README in the apiexample folder: https://github.com/biomap-research/scFoundation/tree/main/apiexample#:~:text=is%20singlecell.-,pre_normalized,-%3A%20Controls%20the%20computation

When input_type is singlecell, T or F indicates whether the input gene expression data has already been normalized and log1p-transformed. A means the data is normalized and log1p-transformed with the total count appended as the last column, resulting in a data shape of N × 19265. This mode is used for the GEARS task. When input_type is bulk, F means the T and S token values are log10(sum of gene expression), while T means they are the sum without log transformation. This is useful for bulk data with few sequenced genes.
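To make the shapes concrete, here is a minimal sketch of how an input for the A mode might be assembled, and of the two bulk token values described above. It is not the project's actual code. The function names are hypothetical, the normalization target of 1e4 counts per cell is an assumption, and so is the choice to append the raw (untransformed) total count as the last column.

```python
import numpy as np

N_GENES = 19264  # gene count implied by the N x 19265 shape above


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell to `target_sum` total counts, then apply log1p.

    target_sum=1e4 is an assumed convention, not taken from the thread.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    safe_totals = np.where(totals == 0, 1.0, totals)  # avoid division by zero
    return np.log1p(counts / safe_totals * target_sum)


def build_mode_a_input(counts, target_sum=1e4):
    """Normalized+log1p matrix with the total count appended as the last column.

    Appending the raw total (rather than a transformed one) is an assumption.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    return np.hstack([normalize_log1p(counts, target_sum), totals])


def bulk_token_value(counts, pre_normalized):
    """Bulk input: 'F' gives log10(sum of expression), 'T' gives the plain sum."""
    total = np.asarray(counts, dtype=float).sum(axis=1)
    return total if pre_normalized == "T" else np.log10(total)
```

For a raw count matrix of shape (N, 19264), `build_mode_a_input` returns an array of shape (N, 19265), which matches the shape described for the A mode.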