Open poddarharsh15 opened 1 month ago
Hi @Karenxzr, I have tried several times with the .csv format as well, but I am still getting the same errors. Please have a look at the attached file: output.csv
python3 phenosv/model/phenosv.py --sv_file ~/structural_varinats/merged_vcfs/output.csv --target_folder test1/ --target_file_name Final_out
Traceback (most recent call last):
File "/net/192.168.120.240/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/phenosv.py", line 177, in <module>
main()
File "/net/192.168.120.240/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/phenosv.py", line 150, in main
pred = of.phenosv(None, None, None, None, sv_df, annotation_path, model, elements_path, feature_files, scaler_file,
File "/net/192.168.120.240/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/../model/operation_function.py", line 552, in phenosv
if sv.shape[1]==5:
AttributeError: 'NoneType' object has no attribute 'shape'
Hi, I tested the top 20 lines of your output.csv file and they worked fine. Please use an absolute path for --sv_file. It seems PhenoSV did not read your input data correctly.
python3 phenosv/model/phenosv.py --sv_file /Users/zhuoranx/Documents/ResearchProject/PhenoSV/PhenoSV/data/test2.csv --target_folder /Users/zhuoranx/Documents/ResearchProject/PhenoSV/PhenoSV/data --target_file_name test_out
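Since the traceback shows that the SV table was None by the time it reached operation_function.py, a quick pre-flight check outside PhenoSV can help confirm whether the file is found and how its rows are shaped. The following is a minimal sketch, not part of PhenoSV: the function name summarize_sv_file is hypothetical, and it assumes a comma-delimited file with five fields per row (CHR, START, END, ID, SVTYPE), as suggested by the column names in the traceback.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_sv_file(path, delimiter=","):
    """Return (resolved path, number of non-empty rows, {fields per row: count}).

    Hypothetical helper for inspecting an input file before running PhenoSV.
    """
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"SV file not found: {resolved}")

    field_counts = Counter()
    with resolved.open(newline="") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            if row:  # csv.reader yields [] for completely empty lines
                field_counts[len(row)] += 1

    return resolved, sum(field_counts.values()), dict(field_counts)
```

If the returned histogram contains anything other than a single key of 5, some rows have an unexpected number of fields and are worth inspecting by hand.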
Hi @Karenxzr, thank you for your fast response. I have tried several runs using an absolute path, but I still get the same error. Please have a look :(( Do I need to run "pip install ."?
python3 phenosv/model/phenosv.py --sv_file /home/tigem/h.poddar/structural_varinats/PhenoSV/data/output.csv --target_folder /home/tigem/h.poddar/structural_varinats/PhenoSV/data/ --target_file_name test1
Traceback (most recent call last):
File "/net/192.168.120.240/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/phenosv.py", line 177, in <module>
main()
File "/net/192.168.120.240/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/phenosv.py", line 150, in main
pred = of.phenosv(None, None, None, None, sv_df, annotation_path, model, elements_path, feature_files, scaler_file,
File "/net/192.168.120.240/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/../model/operation_function.py", line 552, in phenosv
if sv.shape[1]==5:
AttributeError: 'NoneType' object has no attribute 'shape'
UPDATE: I found that the issue only occurs when processing the entire output.csv file, which contains almost 13,000 structural variants (SVs). When working with a subset of 30 lines from the same file, everything functions correctly without any errors. It seems the problem arises when handling a larger dataset. Could you please advise on possible solutions to address this?
Test run results: test_out.csv
Hi, as mentioned in the tutorial, you can split the input CSV file and run multiple smaller CSV files simultaneously. An example is below; you can increase the number of threads from 4 to 32 or so.
bash phenosv/model/phenosv.sh 'path/to/sv/data.csv' 'folder/path/to/store/results' 4 'HP:0000707,HP:0007598'
In addition, the source code is here: https://github.com/WGLab/PhenoSV/blob/main/phenosv/model/phenosv.sh. If you use SLURM, you can split the input file as in the shell script and submit a job array.
Another thought: some abnormal rows in your data may have caused this error. If you split the file, you will likely be able to identify the offending rows.
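As an alternative to bisecting the file by hand, abnormal rows can be searched for directly. The sketch below is not PhenoSV code: find_abnormal_rows is a hypothetical helper, and its rules are assumptions (five comma-separated fields per row, integer START and END, and an SVTYPE that is not a bare nucleotide sequence). It does not know which SVTYPE labels PhenoSV accepts.

```python
import csv
import re

# Assumption: an SVTYPE made only of A/C/G/T/N letters is a literal sequence,
# not an SV type label.
SEQUENCE_LIKE = re.compile(r"^[ACGTN]+$", re.IGNORECASE)


def find_abnormal_rows(path, expected_fields=5):
    """Yield (line_number, reason, row) for rows that look malformed."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        for row in reader:
            if not row:
                continue  # ignore completely empty lines
            line_number = reader.line_num
            if len(row) != expected_fields:
                reason = f"expected {expected_fields} fields, got {len(row)}"
                yield line_number, reason, row
                continue
            _chrom, start, end, _sv_id, sv_type = (field.strip() for field in row)
            if not (start.isdigit() and end.isdigit()):
                yield line_number, "START/END is not a non-negative integer", row
            elif SEQUENCE_LIKE.match(sv_type):
                yield line_number, "SVTYPE looks like a nucleotide sequence", row
```

Iterating over find_abnormal_rows on the full file and printing each tuple gives the line numbers to inspect. A header line, if present, would be reported because its START and END are not integers.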
Hi @Karenxzr, Do you have any suggestions for converting VCF files to CSV or BED formats? Currently, I am using vcf2bed to convert VCF files to BED format and then manipulating the data to create a CSV file, as shown in the sample data. Any advice or alternative approaches would be greatly appreciated. Thank you!
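One possible approach, sketched here with the Python standard library only, is to read the SV type and end coordinate from the VCF INFO column rather than from ALT. This is an illustration under stated assumptions, not a PhenoSV-provided converter: the function names are hypothetical, the VCF is assumed to be uncompressed and to carry SVTYPE and END in INFO, the output layout (CHR, START, END, ID, SVTYPE) is taken from the column names in the traceback, and records without an END key fall back to POS. Coordinates and SVTYPE values are copied as-is, so any conversion PhenoSV may require is not handled here.

```python
import csv


def vcf_to_sv_rows(vcf_lines):
    """Yield (CHR, START, END, ID, SVTYPE) tuples from uncompressed VCF lines."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta lines, the header line, and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, sv_id, info_text = fields[0], fields[1], fields[2], fields[7]

        # Parse INFO entries of the form KEY=VALUE; flags have no value.
        info = {}
        for entry in info_text.split(";"):
            key, _, value = entry.partition("=")
            info[key] = value

        if "SVTYPE" not in info:
            continue  # record is not annotated as a structural variant
        end = info.get("END", pos)  # assumption: fall back to POS when END is absent
        yield chrom, int(pos), int(end), sv_id, info["SVTYPE"]


def write_sv_csv(vcf_path, csv_path):
    """Write the extracted rows to a headerless CSV file."""
    with open(vcf_path) as src, open(csv_path, "w", newline="") as dst:
        csv.writer(dst).writerows(vcf_to_sv_rows(src))
```

Because SVTYPE comes from INFO, literal sequences in the ALT column never end up in the type field.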
I have identified the issue with my input.csv file: it contained some unrecognized SVTYPE entries (e.g., ACGGGGCAGGGAGGGCCCCTCTAGAAGCCACCTGTGCAGAC). After removing those and ensuring the CSV file only includes known SVTYPE values, I am still encountering an error. Could you please suggest some ideas or solutions for this issue?
PS: PhenoSV still runs after emitting this error and generates a CSV output with results. Please see the CSV file for reference.
Thank you in advance for your help!
Command applied using SLURM:
eval "$(conda shell.bash hook)"
conda activate phenosv
CONFIG_FILE="/home/tigem/h.poddar/structural_varinats/PhenoSV/input_files.txt"
TARGET_FOLDER="/home/tigem/h.poddar/structural_varinats/PhenoSV/final_test"
phenosvsh="/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/phenosv.sh"
THREADS=64
mapfile -t INPUT_FILES < "$CONFIG_FILE"
SV_FILE="${INPUT_FILES[$SLURM_ARRAY_TASK_ID]}"
echo "Processing SV file: ${SV_FILE}"
echo "Target folder: ${TARGET_FOLDER}"
bash "${phenosvsh}" "${SV_FILE}" "${TARGET_FOLDER}" "${THREADS}" 'HP:0000707,HP:0007598'
echo "PhenoSV processing completed for ${SV_FILE}!"
Traceback (most recent call last):
File "/net/192.168.120.240/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/phenosv.py", line 177, in <module>
main()
File "/net/192.168.120.240/home/tigem/h.poddar/structural_varinats/PhenoSV/phenosv/model/phenosv.py", line 122, in main
sv_df.columns = ['CHR', 'START', 'END', 'ID', 'SVTYPE']
File "/home/tigem/h.poddar/miniconda3/envs/phenosv/lib/python3.10/site-packages/pandas/core/generic.py", line 5588, in __setattr__
return object.__setattr__(self, name, value)
File "pandas/_libs/properties.pyx", line 70, in pandas._libs.properties.AxisProperty.__set__
File "/home/tigem/h.poddar/miniconda3/envs/phenosv/lib/python3.10/site-packages/pandas/core/generic.py", line 769, in _set_axis
self._mgr.set_axis(axis, labels)
File "/home/tigem/h.poddar/miniconda3/envs/phenosv/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 214, in set_axis
self._validate_set_axis(axis, new_labels)
File "/home/tigem/h.poddar/miniconda3/envs/phenosv/lib/python3.10/site-packages/pandas/core/internals/base.py", line 69, in _validate_set_axis
raise ValueError(
ValueError: Length mismatch: Expected axis has 1 elements, new values have 5 elements
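The message above means pandas ended up with a table of one column while five column names were being assigned. One plausible cause, which is an assumption and not confirmed in this thread, is that the file handed to phenosv.py uses a different delimiter than the reader expects (for example tabs instead of commas), so each line is parsed as a single field. The sketch below uses a hypothetical helper, guess_delimiter, to check which candidate delimiter splits the first rows into five fields.

```python
import csv


def guess_delimiter(path, candidates=(",", "\t", ";"), expected_fields=5, sample_rows=50):
    """Return (best_delimiter_or_None, scores) for the first lines of a file.

    scores maps each candidate delimiter to the number of sampled rows that
    split into exactly expected_fields fields.
    """
    with open(path, newline="") as handle:
        sample = [line for _, line in zip(range(sample_rows), handle) if line.strip()]

    scores = {}
    for delimiter in candidates:
        rows = csv.reader(sample, delimiter=delimiter)
        scores[delimiter] = sum(1 for row in rows if len(row) == expected_fields)

    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores
```

If the best delimiter is not a comma, or no candidate yields five fields, the chunk files listed in input_files.txt are worth opening to see how they were written.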
Hi @Karenxzr,
I'm experiencing issues with BED files when running PhenoSV, as illustrated in the attached errors. The errors probably stem from the BED file format, and I am unable to resolve them.
Could you please take a look and suggest possible solutions?
Thank you for your assistance!
python3 phenosv/model/phenosv.py --sv_file ~/structural_varinats/merged_vcfs/output.bed --target_folder test1/ --target_file_name Final_out
output.zip