Open wangzhenzZ opened 4 months ago
@wangzhenzZ it's normal to have more than 1 file in pkl_input. It's because there are more than 100 UTR regions with available data from your BAM file.
Can you provide the file that caused the error?
If not, it seems that the error happened on the first object in the pickle file, so can you provide an overview of the first object by running the following commands:
import pickle
with open("your_file_name.pkl", "rb") as file:
    a = pickle.load(file)
print(a)
This is the file.
import pickle
with open('possorted_genome_bam.100.112.100.input.pkl', 'rb') as file:
    a = pickle.load(file)
print(a)
('X:ENSSSCG00000012271:1:41803135-41803953:+',
        x    l    r     pa    cb_id  read_id  junction     seg1_en     seg2_en
0      22  120  NaN    NaN  1052804        0         1  41803214.0  41803341.0
1      26  122  NaN    NaN   443991        1         1  41803214.0  41803349.0
2       2  150  NaN    NaN   492860        2         1  41803214.0  41803381.0
3       2  116  NaN    NaN   325148        3         1  41803214.0  41803347.0
4      36  120  NaN    NaN   502991        4         1  41803214.0  41803355.0
...   ...  ...   ..    ...      ...      ...       ...         ...         ...
1545  475   35  NaN  509.0   591034     1545         0         NaN         NaN
1546  476   43  NaN  518.0    20052     1546         0         NaN         NaN
1547  480   39  NaN  518.0   685085     1547         0         NaN         NaN
1548  484   43  NaN  526.0   841147     1548         0         NaN         NaN
1549  516   33  NaN  548.0   167917     1549         0         NaN         NaN
[1550 rows x 9 columns])
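The output above shows only the first object in the file. If the input file holds several objects pickled one after another (an assumption based on the mention of "the first object" earlier in this thread, not something confirmed here), a minimal sketch for walking through all of them could look like this. The function name iter_pickled_objects is hypothetical and not part of scape.

```python
import pickle


def iter_pickled_objects(pkl_path):
    """Yield every object stored back to back in one pickle file."""
    with open(pkl_path, "rb") as handle:
        while True:
            try:
                # Each call reads exactly one pickled object from the stream.
                yield pickle.load(handle)
            except EOFError:
                # Raised when there is nothing left to read.
                return


# Usage (file name taken from the comment above):
# for obj in iter_pickled_objects("possorted_genome_bam.100.112.100.input.pkl"):
#     print(type(obj))
```

Loading these files requires pandas to be installed, since the objects contain dataframes.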
@wangzhenzZ Is it possible for you to write this data to a separate pickle file and attach it here so that I can debug it?
I noticed that this Issues board does not support uploading files in the pickle format. To facilitate the sharing of the necessary files, could you please provide me with your email address where I can send the pickle file?
@wangzhenzZ Oh, you can attach just the dataframe here as a TSV or CSV file.
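A minimal sketch of how the dataframe could be written out as TSV, assuming the first object in the pickle file is a (utr_id, dataframe) tuple as in the printed output earlier in this thread. The helper names load_first_object and write_frame_as_tsv are made up for illustration; to_csv is the standard pandas DataFrame method.

```python
import pickle


def load_first_object(pkl_path):
    """Return the first object stored in the pickle file."""
    with open(pkl_path, "rb") as handle:
        return pickle.load(handle)


def write_frame_as_tsv(first_object, tsv_path):
    """Write the dataframe part of a (utr_id, dataframe) tuple to a TSV file."""
    # Assumption: the object is a 2-tuple, as in the printed output.
    utr_id, frame = first_object
    # sep="\t" gives tab-separated output; index=False drops the row index.
    frame.to_csv(tsv_path, sep="\t", index=False)
    return utr_id


# Usage (input file name taken from the comment above; output name is arbitrary):
# obj = load_first_object("possorted_genome_bam.100.112.100.input.pkl")
# print(write_frame_as_tsv(obj, "first_utr.tsv"))
```

The resulting .tsv file is plain text, so it can be zipped or renamed to .txt if the upload form rejects the extension.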
@ThuyTien1 , do you have time to look into this issue?
@wangzhenzZ, I have just updated the package. Could you try reinstalling the package and running your analysis again to see if the problem is solved?
Hi, I also had the same problem. There is no output from infer_pa. I noticed this issue, so I installed scape using conda. I am not sure if I installed the latest version of scape, but it doesn't work for me so far.
Best, Dongxu
Hi Dongxu,
@Dongxu-Zheng , I just tried it. It seems to be working. Here are the commands I use.
git clone https://github.com/chengl7-lab/scape.git
conda remove -n scape_env --all
conda env create -f mac_environment.yml
conda activate scape_env
cd ./scape/examples/toy-example
# here we use "test" as the output directory, takes 6 mins on my laptop
scape prepare_input --utr_file ./GRCh38_98.csv --cb_file ./barcodes.tsv.gz --bam_file ./example.bam --output_dir ./test --chunksize 100
# infer pA sites
scape infer_pa --pkl_input_file ./test/pkl_input/example.100.1.1.input.pkl --output_dir ./test
# remove spurious pA sites, the "--utr_merge" should be set to False if you are only interested in one UTR
scape merge_pa --output_dir ./test --utr_merge True
# Extract the pA counts, the result is in ./test/res.gene.cnt.tsv.gz
scape ex_pa_cnt_mat --output_dir ./test --res_pkl_file res.gene.pkl
Could you try the steps above and see if it works? If it does not work, could you share your commands and error messages?
best regards, Lu
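When prepare_input produces many files under pkl_input (112 in the original report), the infer_pa step above could be repeated once per file. The sketch below only builds and runs the same command shown above for each file; the helper names are hypothetical. It has not been confirmed in this thread that scape expects to be driven this way, or that merge_pa then picks up all results from a shared output directory, so treat both as assumptions.

```python
import subprocess
from pathlib import Path


def build_infer_pa_commands(pkl_dir, output_dir):
    """Return one `scape infer_pa` argument list per *.input.pkl file."""
    commands = []
    # sorted() makes the processing order reproducible.
    for pkl_file in sorted(Path(pkl_dir).glob("*.input.pkl")):
        commands.append([
            "scape", "infer_pa",
            "--pkl_input_file", str(pkl_file),
            "--output_dir", str(output_dir),
        ])
    return commands


def run_all(pkl_dir, output_dir):
    """Run the commands one after another, stopping at the first failure."""
    for command in build_infer_pa_commands(pkl_dir, output_dir):
        print("running:", " ".join(command))
        subprocess.run(command, check=True)


# Usage, with the directories from the toy example above:
# run_all("./test/pkl_input", "./test")
```

Because check=True raises on a non-zero exit code, the file that triggers an error is the last one printed, which makes it easier to report.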
There is only one pkl file in the pkl_input in the tutorial SCAPE-toy-example.ipynb, but there are 112 pkl files in my data after running scape prepare_input. Is it normal? And how can I perform the next step, scape infer_pa?
I tried the following code:
And I got the error:
I've also attached my current environment: package-list.txt
Any suggestions will be appreciated.