rajewsky-lab / spacemake

Missing headers in project_df.csv trigger error at startup #98

Closed by flcvlr 8 months ago

flcvlr commented 9 months ago

Congratulations on the great code. I just ran into a possible bug in the spacemake projects add_sample command of the current version (0.7.5).

I generated project_df.csv using the spacemake projects add_sample command, and this succeeded. However, when launching the spacemake run command, I got the following error:

ValueError: Index project_id invalid

Visual inspection of project_df.csv revealed that the header line began with just two commas, i.e. the names of the first two fields were missing:

,,puck_barcode_file_id,sample_sheet,species,demux_barcode_mismatch, [...]

after manually changing it to

project_id,,puck_barcode_file_id,sample_sheet,species,demux_barcode_mismatch, [...]

I no longer got the project_id error, but instead got

ValueError: Index sample_id invalid

after adding the sample_id in the header,

project_id,sample_id,puck_barcode_file_id,sample_sheet,species,demux_barcode_mismatch, [...]

everything was fine.
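The manual header edit described above can be sketched in Python using only the standard library. This is a workaround sketch, not part of spacemake: the function name `restore_index_header` is hypothetical, the data row in the example is made up for illustration, and it assumes the first two header fields are the only thing wrong with the file.

```python
import csv
import io

# Assumption: the two index columns carry the names reported in this thread.
INDEX_COLUMNS = ["project_id", "sample_id"]


def restore_index_header(csv_text, index_columns=INDEX_COLUMNS):
    """Return csv_text with empty leading header fields filled in.

    Only header fields that are empty are replaced; a header that already
    carries names is returned unchanged.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    header = rows[0]
    for position, name in enumerate(index_columns):
        if position < len(header) and header[position] == "":
            header[position] = name
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Illustrative input: a shortened header plus one made-up data row.
broken = ",,puck_barcode_file_id,sample_sheet\ndemo,S3,1,none\n"
print(restore_index_header(broken), end="")
```

This only patches the symptom in the file; it does not address whatever caused the names to be dropped when the file was written.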

I may have made a mistake when running the spacemake projects add_sample command. However, it is also possible that, for whatever reason, the current version (0.7.5) fails to write the first two fields of the header.

Valerio

danilexn commented 9 months ago

Hi Valerio,

thanks for such a detailed issue. I have installed spacemake from scratch, using the conda environment provided in the spacemake documentation, and using the conda environment you provided in a separate message. Then, I ran the following:

  1. Initialize spacemake

    spacemake init --dropseq_tools /data/rajewsky/shared_bins/Drop-seq_tools-2.5.1

  2. Adding human genome

    spacemake config add_species \
    --reference genome --name human \
    --sequence /data/rajewsky/genomes/GRCh38/release_38/GRCh38.primary_assembly.genome.fa \
    --annotation /data/rajewsky/genomes/GRCh38/release_38/gencode.v38.annotation.gtf

  3. Adding a sample

    spacemake projects add_sample \
    --project_id demo --sample_id S3 --species human \
    --puck openst --puck_barcode_file ~/data/projects/openst_paper/data/barcode_files/fc_1_*.txt.gz \
    --barcode_flavor openst \
    --run_mode openst \
    --R1 ~/data/projects/openst_paper/data/4_GEO_submit/metastatic_lymph_node_S3_R1_001.fastq.gz \
    --R2 ~/data/projects/openst_paper/data/4_GEO_submit/metastatic_lymph_node_S3_R2_001.fastq.gz

Opening the project_df.csv file, I see:

project_id,sample_id,puck_barcode_file_id,sample_sheet, [...]

which follows the expected formatting. Have you opened/written the csv file outside of spacemake? If not, what are the exact commands that led to this behavior? Apologies for the inconvenience, and thanks in advance!
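A quick way to tell the two header variants in this thread apart is to check the first two field names before launching a run. The sketch below uses only the Python standard library; the helper `missing_index_columns` is a hypothetical name, not a spacemake function, and it assumes the index columns are expected at positions 0 and 1 of the header, as in the line shown above.

```python
import csv
import io

# Assumption: these are the leading header fields spacemake expects,
# taken from the header line quoted in this thread.
EXPECTED_INDEX = ("project_id", "sample_id")


def missing_index_columns(csv_text, expected=EXPECTED_INDEX):
    """Return the expected leading column names that the header lacks."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [
        name
        for position, name in enumerate(expected)
        if position >= len(header) or header[position] != name
    ]


# Header with empty leading fields: both names are reported as missing.
print(missing_index_columns(",,puck_barcode_file_id,sample_sheet\n"))
# Header in the expected format: nothing is reported.
print(missing_index_columns("project_id,sample_id,puck_barcode_file_id\n"))
```

To apply it to a real file, the file contents can be read first and passed in as text, for example with `open("project_df.csv").read()` (the path depends on where the project was initialized).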

Best, Dani

marvin-jens commented 8 months ago

Hi, I've seen a very similar issue before and it was caused by a broken pandas installation. Basically, there were two different versions installed at the same time (conda said version X but import pandas; pandas.__version__ said Y and X != Y). Removing and re-installing pandas helped in that case. I assume making a fresh conda env may also solve this. Please let us know if this resolves the issue.
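The mismatch check described above can be approximated from inside Python. This is a rough sketch under stated assumptions: it compares the version recorded in the installed package metadata (via `importlib.metadata`) with the `__version__` of the module that is actually imported, which is not the same source as `conda list` but tends to expose the same kind of inconsistency. The helper name `compare_versions` is hypothetical.

```python
import importlib
from importlib import metadata


def compare_versions(dist_name, module_name=None):
    """Compare the recorded metadata version with the imported module's version.

    Returns a dict with the version from the package metadata, the
    __version__ of the imported module, the file the module was loaded
    from, and whether the two versions agree.
    """
    module_name = module_name or dist_name
    try:
        recorded = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        recorded = None
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        imported, location = None, None
    else:
        imported = getattr(module, "__version__", None)
        location = getattr(module, "__file__", None)
    return {
        "recorded": recorded,
        "imported": imported,
        "location": location,
        # Only report agreement when a recorded version was actually found.
        "consistent": recorded is not None and recorded == imported,
    }


if __name__ == "__main__":
    # Run this inside the conda env used for spacemake.
    print(compare_versions("pandas"))
```

If `consistent` is false, or `location` points outside the active environment, that suggests more than one pandas copy is visible to Python. Comparing the output with what `conda list pandas` reports covers the conda side of the check.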

flcvlr commented 8 months ago

Dear Marvin,

Thanks, that was exactly the case. I initially had a more recent pandas version, which triggered an error in spacemake. I then uninstalled it and installed a pandas version consistent with the spacemake requirements in the conda env, but did not refresh the conda env. Refreshing the env after reinstalling the appropriate pandas fixed it.

Thanks for your help,

Valerio
