xryanglab / SCASL

single-cell clustering based on alternative splicing landscapes
Apache License 2.0
10 stars, 3 forks

Error encountered while running SCASL #1

Closed: ssyshh closed this issue 5 months ago

ssyshh commented 5 months ago

Hello, I'm glad to see you have created such meaningful work. However, I ran into a problem when running it, and I look forward to your help. I configured the conda environment according to your requirements file, but then I encountered the following error when running the sample data.

python main.py -y configs/srr_star_demo.yaml

image

I also ran python main.py -y configs/srr_demo.yaml

image

kokox10 commented 5 months ago

Thank you for your interest. However, based solely on the information provided, it is difficult for me to accurately identify the problem. I can offer the following suggestions based on my personal experience:

  1. Please verify if the version of pandas is consistent with the requirements.txt file. Many errors stem from version conflicts with specific packages in the environment. In such cases, I recommend using Google Colab (https://colab.research.google.com/). Colab typically downloads the latest versions, minimizing the occurrence of errors.

  2. Please ensure that the format and content of the input file align with the specified requirements.

  3. To pinpoint the exact step where the error occurs, I suggest running the code line by line and examining the output results.

I hope these suggestions prove helpful in resolving the issue you are facing.
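Suggestion 1 above can be checked mechanically. The following is a minimal sketch, assuming the requirements file pins versions with `name==version` lines; the helper name `find_mismatches` is hypothetical and not part of SCASL. It compares each pinned version with what is installed in the current environment.

```python
from importlib.metadata import PackageNotFoundError, version


def find_mismatches(requirement_lines):
    """Return {package: (pinned, installed)} for every pin that is not satisfied.

    Hypothetical helper: only exact pins of the form name==version are checked;
    other specifiers (>=, ~=, ...) are skipped.
    """
    mismatches = {}
    for raw in requirement_lines:
        # Strip comments and environment markers, then surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue
        name, pinned = (part.strip() for part in line.split("==", 1))
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    # Assumes requirements.txt is in the current working directory.
    with open("requirements.txt") as handle:
        for name, (pinned, installed) in find_mismatches(handle).items():
            print(f"{name}: pinned {pinned}, installed {installed}")
```

Running the script from the repository root lists every package whose installed version differs from the pinned one, which makes a pandas mismatch easy to spot.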

ssyshh commented 5 months ago

Thank you for your timely reply. It really was a package version problem on my side.

hmgene commented 2 months ago

My solution (please let me know if this is what you intended):


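import pandas as pd

def to_prob(df, groupby):
    # Per-group sums; min_count=1 keeps all-NaN groups as NaN instead of 0.
    sums = df.groupby(groupby).sum(min_count=1)
    # The merge suffixes overlapping non-key columns with _x (from df) and _y (from sums).
    sums = pd.merge(df, sums, how='left', on=groupby)
    # HMK: drop the key column and both suffixed copies of the other coordinate column.
    if groupby == "start":
        sums = sums.drop(columns=['start', 'end_x', 'end_y'])
    else:
        sums = sums.drop(columns=['start_x', 'start_y', 'end'])
    # Original line: sums = sums.drop(columns=['start', 'end'])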
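To illustrate why the suffixed names appear, here is a small sketch on a toy table. The `count` column and the values are made up for illustration, and the real SCASL input may be laid out differently. When the grouped sums still contain the non-key coordinate column, `pd.merge` renames the overlapping columns with `_x` and `_y`, so a plain `end` column no longer exists and the original `drop` call raises a `KeyError`.

```python
import pandas as pd

# Toy junction table; the 'count' column is a hypothetical stand-in for read counts.
df = pd.DataFrame({"start": [1, 1, 2], "end": [5, 6, 7], "count": [3, 1, 4]})

# Group by 'start': the result is indexed by 'start' and still contains 'end' and 'count'.
sums = df.groupby("start").sum(min_count=1)

# Merging on 'start' suffixes every overlapping non-key column.
merged = pd.merge(df, sums, how="left", on="start")
print(list(merged.columns))

# The original drop fails because there is no plain 'end' column any more.
try:
    merged.drop(columns=["start", "end"])
    drop_failed = False
except KeyError:
    drop_failed = True

# Dropping the suffixed names, as in the workaround above, succeeds.
cleaned = merged.drop(columns=["start", "end_x", "end_y"])
```

After the last line, `cleaned` holds only the `count_x` (per-row) and `count_y` (per-group sum) columns in this toy example.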